Sharing Hash Codes for Multiple Purposes

Wiktor Pronobis*, Danny Panknin*, Johannes Kirschnick, Vignesh Srinivasan, Wojciech Samek, Volker Markl, Manohar Kaul, Klaus-Robert Müller†, and Shinichi Nakajima

arXiv v3 [stat.ML], Jun 2017

Abstract—Locality sensitive hashing (LSH) is a powerful tool for sublinear-time approximate nearest neighbor search, and a variety of hashing schemes have been proposed for different dissimilarity measures. However, hash codes significantly depend on the dissimilarity, which prohibits users from adjusting the dissimilarity at query time. In this paper, we propose multiple purpose LSH (mp-LSH), which shares the hash codes for different dissimilarities. mp-LSH supports L2, cosine, and inner product dissimilarities, and their corresponding weighted sums, where the weights can be adjusted at query time. It also allows us to modify the importance of pre-defined groups of features. Thus, mp-LSH enables us, for example, to retrieve similar items to a query with the user preference taken into account, to find a similar material to a query with some properties (stability, utility, etc.) optimized, and to turn on or off a part of multi-modal information (brightness, color, audio, text, etc.) in image/video retrieval. We theoretically and empirically analyze the performance of three variants of mp-LSH, and demonstrate their usefulness on real-world data sets.

Index Terms—Locality Sensitive Hashing, Approximate Near Neighbor Search, Information Retrieval, Collaborative Filtering

I. INTRODUCTION

Large amounts of data are being collected every day in the sciences and industry. Analyzing such truly big data sets even by linear methods can become infeasible; thus sublinear methods such as locality sensitive hashing (LSH) have become an important analysis tool. For some data collections, the purpose can be clearly expressed from the start, for example, text/image/video/speech analysis or recommender systems. In other cases, such as drug discovery or the materials genome project, the ultimate query structure to such data may still not be fully fixed. In other words, measurements, simulations or observations may be recorded without being able to spell out the full specific purpose, although the general goal (better drugs, more potent materials) is clear. Motivated by the latter case, we consider how one can use LSH schemes without defining any specific dissimilarity at the data acquisition and pre-processing phase.

(* contributed equally; † corresponding author: klaus-robert.mueller@tu-berlin.de. Wiktor Pronobis, Danny Panknin, Klaus-Robert Müller, and Shinichi Nakajima are with Technische Universität Berlin, Machine Learning Group, Marchstr. 23, Berlin, Germany. Johannes Kirschnick and Volker Markl are with DFKI, Language Technology Lab, Alt-Moabit 91c, Berlin, Germany. Vignesh Srinivasan and Wojciech Samek are with Fraunhofer HHI. Volker Markl is also with Technische Universität Berlin, Database Systems and Information Management Group, Einsteinufer 17, Berlin, Germany. Manohar Kaul is with IIT Hyderabad. Klaus-Robert Müller is also with Korea University and the Max Planck Society. Wojciech Samek, Volker Markl, Klaus-Robert Müller, and Shinichi Nakajima are with the Berlin Big Data Center, Berlin, Germany.)

LSH, one of the key technologies for big data analysis, enables approximate nearest neighbor search (ANNS) in sublinear time [1], [2]. With LSH functions for a required dissimilarity measure in hand, each data sample is assigned to a hash bucket in the pre-processing stage. At runtime, ANNS with theoretical guarantees can be performed by restricting the search to the samples that lie within the hash bucket to which the query point is assigned, along with the samples lying in the neighbouring buckets.

A challenge in developing LSH without a pre-defined purpose is that the existing LSH schemes, designed for different dissimilarity measures, provide significantly different hash codes. A naive realization therefore requires as many hash tables as there are possible target dissimilarities, which is not realistic if we need to adjust the importance of multiple criteria. In this paper, we propose three variants of multiple purpose LSH (mp-LSH), which support L2, cosine, and inner product (IP) dissimilarities, and their weighted sums, where the weights can be adjusted at query time.

The first proposed method, called mp-LSH with vector augmentation (mp-LSH-VA), maps the data space into an augmented vector space, so that the squared-L2-distance in the augmented space matches the required dissimilarity measure up to a constant. This scheme can be seen as an extension of recent developments of LSH for maximum IP search (MIPS) [3], [4], [5], [6]. The significant difference from the previous methods is that our method is designed to modify the dissimilarity by changing the augmented query vector. We show that mp-LSH-VA is locality sensitive for the L2 and IP dissimilarities and their weighted sums. However, its performance for the L2 dissimilarity is significantly inferior to the standard L2-LSH [7]. In addition, mp-LSH-VA does not support the cosine-distance.

Our second proposed method, called mp-LSH with code concatenation (mp-LSH-CC), concatenates the hash codes for the L2, cosine, and IP dissimilarities, and constructs a special structure, called a cover tree [8], which enables efficient NNS with the weights for the dissimilarity measures controlled by adjusting the metric in the code space. Although mp-LSH-CC is conceptually simple and its performance is guaranteed by the original LSH scheme for each dissimilarity, it is not memory efficient, which also results in increased query time.

Considering the drawbacks of the aforementioned two variants led us to our final and recommended proposal, called mp-LSH with code augmentation and transformation (mp-LSH-CAT). It supports the L2, cosine, and IP dissimilarities by augmenting the hash codes, instead of the original vectors. mp-LSH-CAT is memory efficient, since it shares most information over the hash codes for the different dissimilarities, so that the augmentation is minimized.

We theoretically and empirically analyze the performance of the mp-LSH methods, and demonstrate their usefulness on real-world data sets. Our mp-LSH methods also allow us to modify the importance of pre-defined groups of features. Adjustability of the dissimilarity measure at query time is not only useful in the absence of future analysis plans, but is also applicable to multi-criteria searches. The following lists some sample applications of multi-criteria queries in diverse areas:

1) In recommender systems, suggesting items which are similar to a user-provided query and also match the user's preference.
2) In material science, finding materials which are similar to a query material and also possess desired properties such as stability, conductivity, and medical utility.
3) In video retrieval, adjusting the importance of multi-modal information such as brightness, color, audio, and text at query time.

II. BACKGROUND

In this section, we briefly overview previous locality sensitive hashing (LSH) techniques. Assume that we have a sample pool $\mathcal{X} = \{x_n \in \mathbb{R}^L\}_{n=1}^N$ in $L$-dimensional space. Given a query $q \in \mathbb{R}^L$, nearest neighbor search (NNS) solves the following problem:

$x^* = \operatorname{argmin}_{x \in \mathcal{X}} \mathcal{L}(q, x)$,   (1)

where $\mathcal{L}(\cdot, \cdot)$ is a dissimilarity measure. A naive approach computes the dissimilarity from the query to all samples, and then chooses the most similar samples, which takes $O(N)$ time. On the other hand, approximate NNS can be performed in sublinear time. We define the following three terms:

Definition 1 ($S_0$-near neighbor): For $S_0 > 0$, $x$ is called an $S_0$-near neighbor of $q$ if $\mathcal{L}(q, x) \le S_0$.

Definition 2 ($c$-approximate nearest neighbor search): Given $S_0 > 0$, $\delta > 0$, and $c > 1$, $c$-approximate nearest neighbor search ($c$-ANNS) reports some $cS_0$-near neighbor of $q$ with probability $1 - \delta$, if there exists an $S_0$-near neighbor of $q$ in $\mathcal{X}$.

Definition 3 (Locality sensitive hashing): A family $\mathcal{H} = \{h : \mathbb{R}^L \to \mathcal{K}\}$ of functions is called $(S_0, cS_0, p_1, p_2)$-sensitive for a dissimilarity measure $\mathcal{L} : \mathbb{R}^L \times \mathbb{R}^L \to \mathbb{R}$ if the following two conditions hold for any $q, x \in \mathbb{R}^L$:

- if $\mathcal{L}(q, x) \le S_0$, then $P(h(q) = h(x)) \ge p_1$;
- if $\mathcal{L}(q, x) \ge cS_0$, then $P(h(q) = h(x)) \le p_2$,

where $P$ denotes the probability of the event with respect to the random draw of hash functions. Note that $p_1 > p_2$ is required for LSH to be useful. The image $\mathcal{K}$ of the hash functions is typically binary or integer.

The following proposition guarantees that LSH functions enable $c$-ANNS in sublinear time.

Proposition 1 [1]: Given a family of $(S_0, cS_0, p_1, p_2)$-sensitive hash functions, there exists an algorithm for $c$-ANNS with $O(N^\rho \log N)$ query time and $O(N^{1+\rho})$ space, where $\rho = \frac{\log p_1}{\log p_2} < 1$.

Below, we introduce three LSH families. Let $\mathcal{N}_L(\mu, \Sigma)$ be the $L$-dimensional Gaussian distribution, $\mathcal{U}_L(\alpha, \beta)$ be the $L$-dimensional uniform distribution with support $[\alpha, \beta]$ in all dimensions, and $I_L$ be the $L$-dimensional identity matrix. The sign function, $\operatorname{sign}(z) : \mathbb{R}^H \to \{-1, 1\}^H$, applies element-wise, giving $1$ for $z_h \ge 0$ and $-1$ for $z_h < 0$. Likewise, the floor operator $\lfloor \cdot \rfloor$ applies element-wise to a vector. We denote by $\angle(\cdot, \cdot)$ the angle between two vectors.

Proposition 2 (L2-LSH [7]): For the L2-distance $\mathcal{L}_{L2}(q, x) = \|q - x\|_2$, the hash function

$h^{L2}_{a,b}(x) = \lfloor R^{-1} (a^\top x + b) \rfloor$,   (2)

where $R > 0$ is a fixed real number, $a \sim \mathcal{N}_L(0, I_L)$, and $b \sim \mathcal{U}_1(0, R)$, satisfies $P(h^{L2}_{a,b}(q) = h^{L2}_{a,b}(x)) = F^{L2}_R(\mathcal{L}_{L2}(q, x))$, where

$F^{L2}_R(d) = 1 - 2\Phi(-R/d) - \frac{2}{\sqrt{2\pi}\,(R/d)}\left(1 - e^{-(R/d)^2/2}\right)$.

Here, $\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\,dy$ is the standard cumulative Gaussian.

Proposition 3 (sign-LSH [9], [10]): For the cosine-distance $\mathcal{L}_{\cos}(q, x) = 1 - \cos\angle(q, x) = 1 - \frac{q^\top x}{\|q\|\|x\|}$, the hash function

$h^{\operatorname{sign}}_a(x) = \operatorname{sign}(a^\top x)$,   (3)

where $a \sim \mathcal{N}_L(0, I_L)$, satisfies $P(h^{\operatorname{sign}}_a(q) = h^{\operatorname{sign}}_a(x)) = F^{\operatorname{sign}}(\mathcal{L}_{\cos}(q, x)/2)$, where

$F^{\operatorname{sign}}(d) = 1 - \frac{1}{\pi} \cos^{-1}(1 - 2d)$.   (4)

Proposition 4 (simple-LSH [6]): Assume that the samples and the query are rescaled so that $\max_{x \in \mathcal{X}} \|x\| \le 1$ and $\|q\| \le 1$. For the inner product dissimilarity $\mathcal{L}_{\mathrm{ip}}(q, x) = 1 - q^\top x$ (with which the NNS problem (1) is called maximum IP search (MIPS)), the asymmetric hash functions

$h^{\mathrm{smp\text{-}q}}_a(q) = h^{\operatorname{sign}}_a(\widetilde{q}) = \operatorname{sign}(a^\top \widetilde{q})$, where $\widetilde{q} = (q; 0)$,   (5)
$h^{\mathrm{smp\text{-}x}}_a(x) = h^{\operatorname{sign}}_a(\widetilde{x}) = \operatorname{sign}(a^\top \widetilde{x})$, where $\widetilde{x} = (x; \sqrt{1 - \|x\|^2})$,   (6)

satisfy $P(h^{\mathrm{smp\text{-}q}}_a(q) = h^{\mathrm{smp\text{-}x}}_a(x)) = F^{\operatorname{sign}}(\mathcal{L}_{\mathrm{ip}}(q, x)/2)$.

These three LSH methods are standard and state-of-the-art among the data-independent LSH schemes for their respective dissimilarity measures. Although all of them involve the same random projection $a^\top x$, the resulting hash codes are significantly different from each other.
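To make the three base families concrete, the following Python sketch implements Propositions 2-4 directly. This is our own illustrative code, not the authors' implementation; all function names and parameter values (e.g., R = 2) are assumptions for the example. The fraction of colliding hashes approximates the collision probabilities $F^{L2}_R$ and $F^{\operatorname{sign}}$.

import numpy as np

rng = np.random.default_rng(0)

def l2_lsh(x, A, b, R=2.0):
    """L2-LSH (Prop. 2): integer codes floor((a^T x + b) / R)."""
    return np.floor((A @ x + b) / R).astype(int)

def sign_lsh(x, A):
    """sign-LSH (Prop. 3): binary codes sign(a^T x)."""
    return np.where(A @ x >= 0, 1, -1)

def simple_lsh_query(q, A):
    """simple-LSH (Prop. 4), query side: augment with a zero entry."""
    return sign_lsh(np.append(q, 0.0), A)

def simple_lsh_sample(x, A):
    """simple-LSH (Prop. 4), sample side: augment so that ||x~|| = 1.
    Requires ||x|| <= 1 (rescale the pool beforehand)."""
    return sign_lsh(np.append(x, np.sqrt(max(0.0, 1.0 - x @ x))), A)

L, T = 128, 64                        # input dimension, number of hashes
A  = rng.standard_normal((T, L))      # shared Gaussian projections
A1 = rng.standard_normal((T, L + 1))  # projections for the augmented space
b  = rng.uniform(0.0, 2.0, size=T)    # offsets for L2-LSH (R = 2)

x = rng.standard_normal(L); x /= 2 * np.linalg.norm(x)  # ||x|| <= 1
q = rng.standard_normal(L); q /= np.linalg.norm(q)      # ||q|| = 1

# Collision rates approximate the probabilities in Propositions 2-4.
print((l2_lsh(q, A, b) == l2_lsh(x, A, b)).mean())
print((sign_lsh(q, A) == sign_lsh(x, A)).mean())
print((simple_lsh_query(q, A1) == simple_lsh_sample(x, A1)).mean())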

III. PROPOSED METHODS AND THEORY

In this section, we first define the problem setting. Then, we propose three LSH methods for multiple dissimilarity measures, and conduct a theoretical analysis.

A. Problem Setting

Similarly to simple-LSH (Proposition 4), we rescale the samples so that $\max_{x \in \mathcal{X}} \|x\| \le 1$. We also assume $\|q\| \le 1$.¹ Let us assume multi-modal data, where we can separate the feature vectors into $G$ groups, i.e., $q = (q^{(1)}; \ldots; q^{(G)})$, $x = (x^{(1)}; \ldots; x^{(G)})$.² For example, each group corresponds to monochrome, color, audio, and text features in video retrieval. We also accept multiple queries $\{q_w\}_{w=1}^W$ for a single retrieval task. Our goal is to perform ANNS for the following dissimilarity measure, which we call the multiple purpose (MP) dissimilarity:

$\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) = \sum_{w=1}^{W} \sum_{g=1}^{G} \left\{ \gamma_w^{(g)} \left\| q_w^{(g)} - x^{(g)} \right\|^2 + \eta_w^{(g)} \left\| \frac{q_w^{(g)}}{\|q_w^{(g)}\|} - \frac{x^{(g)}}{\|x^{(g)}\|} \right\|^2 + 2\lambda_w^{(g)} \left( 1 - q_w^{(g)\top} x^{(g)} \right) \right\}$,   (7)

where $\gamma_w, \eta_w, \lambda_w \in \mathbb{R}_+^G$ are the feature weights such that $\sum_{w=1}^W \sum_{g=1}^G (\gamma_w^{(g)} + \eta_w^{(g)} + \lambda_w^{(g)}) = 1$. In the single query case, where $W = 1$, setting $\gamma = (1/2, 0, 1/2, 0, \ldots, 0)$, $\eta = \lambda = (0, \ldots, 0)$ corresponds to L2-NNS based on the first and the third feature groups, while setting $\gamma = \eta = (0, \ldots, 0)$, $\lambda = (1/2, 0, 1/2, 0, \ldots, 0)$ corresponds to MIPS on the same feature groups. When we would like to down-weight the importance of signal amplitude (e.g., brightness of an image) of the $g$-th feature group, we should increase the weight $\eta_w^{(g)}$ for the cosine-distance and decrease the weight $\gamma_w^{(g)}$ for the squared-L2-distance. Multiple queries are useful when we mix NNS and MIPS, for which the queries lie in different spaces with the same dimensionality. For example, by setting $\gamma_1 = \lambda_2 = (1/4, 0, 1/4, 0, \ldots, 0)$, $\gamma_2 = \eta_1 = \eta_2 = \lambda_1 = (0, \ldots, 0)$, we can retrieve items which are close to the item query $q_1$ and match the user preference query $q_2$. An important requirement for our proposal is that the weights $\{\gamma_w, \eta_w, \lambda_w\}$ can be adjusted at query time.

¹ This assumption is reasonable for L2-NNS if the size of the sample pool is sufficiently large and the query follows the same distribution as the samples. For MIPS, the norm of the query can be arbitrarily modified, and we set it to $\|q\| = 1$.
² A semicolon denotes the row-wise concatenation of vectors, as in MATLAB.
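The MP dissimilarity (7) is also what one would evaluate in a brute-force linear scan to obtain ground truth. The sketch below (our own; group boundaries, names, and weight values are illustrative assumptions) computes Eq. (7) directly, here with the mixed setting $\gamma_1 = \lambda_2 = (1/4, 0)$ from the example above:

import numpy as np

def mp_dissimilarity(queries, x, groups, gamma, eta, lam):
    """queries: list of W query vectors; x: sample vector;
    groups: list of index arrays, one per feature group;
    gamma, eta, lam: (W, G) arrays of query-time weights (Eq. (7))."""
    total = 0.0
    for w, q in enumerate(queries):
        for g, idx in enumerate(groups):
            qg, xg = q[idx], x[idx]
            total += gamma[w, g] * np.sum((qg - xg) ** 2)
            total += eta[w, g] * np.sum((qg / np.linalg.norm(qg)
                                         - xg / np.linalg.norm(xg)) ** 2)
            total += 2 * lam[w, g] * (1.0 - qg @ xg)
    return total

# Example: W = 2 queries, G = 2 groups; q1 drives L2-NNS on group 1,
# q2 drives MIPS on group 1.
rng = np.random.default_rng(1)
groups = [np.arange(0, 64), np.arange(64, 128)]
q1, q2, x = (rng.standard_normal(128) for _ in range(3))
for v in (q1, q2, x):
    v /= np.linalg.norm(v)
gamma = np.array([[0.25, 0.0], [0.0, 0.0]])
lam   = np.array([[0.0, 0.0], [0.25, 0.0]])
eta   = np.zeros((2, 2))
print(mp_dissimilarity([q1, q2], x, groups, gamma, eta, lam))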
B. Multiple Purpose LSH with Vector Augmentation (mp-LSH-VA)

Our first method, called multiple purpose LSH with vector augmentation (mp-LSH-VA), is inspired by the research on asymmetric LSHs for MIPS [3], [4], [5], [6], where the query and the samples are augmented with additional entries, so that the squared-L2-distance in the augmented space coincides with the target dissimilarity up to a constant. A significant difference of our proposal from the previous methods is that we design the augmentation so that we can adjust the dissimilarity measure (i.e., the feature weights $\{\gamma_w, \lambda_w\}$ in Eq. (7)) by modifying the augmented query vector. Since mp-LSH-VA, unfortunately, does not support the cosine-distance, we set $\eta_w = (0, \ldots, 0)$ in this subsection. We define the weighted sum query by $\overline{q} = (\overline{q}^{(1)}; \ldots; \overline{q}^{(G)}) = \sum_{w=1}^W \left( \phi_w^{(1)} q_w^{(1)}; \ldots; \phi_w^{(G)} q_w^{(G)} \right)$, where $\phi_w^{(g)} = \gamma_w^{(g)} + \lambda_w^{(g)}$.

We augment the queries and the samples as follows:

$\widetilde{q} = (\overline{q}; r)$, $\widetilde{x} = (x; y)$,

where $r \in \mathbb{R}^M$ is a vector-valued function of $\{q_w\}$, and $y \in \mathbb{R}^M$ is a function of $x$. We constrain the augmentation $y$ for the sample vector so that it satisfies, for a constant $c_1 \ge 1$,

$\|\widetilde{x}\| = c_1$, i.e., $\|y\| = \sqrt{c_1^2 - \|x\|^2}$.   (8)

Under this constraint, the norm of any augmented sample is equal to $c_1$, which allows us to use sign-LSH (Proposition 3) to perform L2-NNS. The squared-L2-distance between the query and a sample in the augmented space can be expressed as

$\|\widetilde{q} - \widetilde{x}\|^2 = \|\overline{q} - x\|^2 + \|r - y\|^2 + \mathrm{const}$.   (9)

For $M = 1$, the only choice satisfying Eq. (8) is simple-LSH (with $r = 0$), given in Proposition 4. We consider the case $M \ge 2$, and design $r$ and $y$ so that Eq. (9) matches the MP dissimilarity (7). The augmentation that matches the MP dissimilarity is not unique. Here, we introduce the following simple construction with $M = G + 3$:

$\widetilde{q} = \left( \widehat{q};\ \sqrt{c_2^2 - \|\widehat{q}\|^2} \right)$, $\widetilde{x} = \left( \widehat{x};\ 0 \right)$, where   (10)

$\widehat{q} = \left( \overline{q};\ \textstyle\sum_{w} \gamma_w^{(1)};\ \ldots;\ \sum_{w} \gamma_w^{(G)};\ 0;\ \mu \right)$,
$\widehat{x} = \left( x;\ -\tfrac{1}{2}\|x^{(1)}\|^2;\ \ldots;\ -\tfrac{1}{2}\|x^{(G)}\|^2;\ \nu;\ 1 \right)$.

Here, we defined

$\mu = -\tfrac{1}{2} \sum_{w=1}^W \sum_{g=1}^G \gamma_w^{(g)} \|q_w^{(g)}\|^2$, $\nu = \sqrt{c_1^2 - \|x\|^2 - \tfrac{1}{4}\sum_{g=1}^G \|x^{(g)}\|^4 - 1}$,
$c_1 = \max_{x \in \mathcal{X}} \sqrt{\|x\|^2 + \tfrac{1}{4}\sum_{g=1}^G \|x^{(g)}\|^4 + 1}$, $c_2 = \max_{\{q_w\}} \|\widehat{q}\|$.

With the vector augmentation (10), Eq. (9) matches Eq. (7) up to a constant (see Appendix A):

$\|\widetilde{q} - \widetilde{x}\|^2 = c_1^2 + c_2^2 - 2\widetilde{q}^\top \widetilde{x} = \mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const}$.

The collision probability, i.e., the probability that the query and the sample are given the same code, can be analytically computed:

Theorem 1: Assume that the samples are rescaled so that $\max_{x \in \mathcal{X}} \|x\| \le 1$ and $\|q_w\| \le 1, \forall w$. For the MP dissimilarity $\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x)$, given by Eq. (7) with $\eta_w = 0, \forall w$, the asymmetric hash functions

$h^{\mathrm{VA\text{-}q}}_a(\{q_w\}) = h^{\operatorname{sign}}_a(\widetilde{q}) = \operatorname{sign}(a^\top \widetilde{q})$,
$h^{\mathrm{VA\text{-}x}}_a(x) = h^{\operatorname{sign}}_a(\widetilde{x}) = \operatorname{sign}(a^\top \widetilde{x})$,

where $\widetilde{q}$ and $\widetilde{x}$ are given by Eq. (10), satisfy

$P\left( h^{\mathrm{VA\text{-}q}}_a(\{q_w\}) = h^{\mathrm{VA\text{-}x}}_a(x) \right) = F^{\operatorname{sign}}\left( \frac{1}{2} + \frac{\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x)/2 - \|\lambda\|_1}{2 c_1 c_2} \right)$.

Proof: By construction, it holds that $\|\widetilde{x}\| = c_1$ and $\|\widetilde{q}\| = c_2$, and simple calculations (see Appendix A) give $\widetilde{q}^\top \widetilde{x} = \|\lambda\|_1 - \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x)$. Then, applying Proposition 3 immediately proves the theorem. ∎
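The following sketch illustrates the vector augmentation idea for the simplest case $G = 1$, $W = 1$; it is our own simplification of Eq. (10), with the constants chosen for this toy setup, not the authors' exact construction. The key property it demonstrates is that $\widetilde{q}^\top \widetilde{x} = \|\lambda\|_1 - \mathcal{L}_{\mathrm{mp}}/2$, and that the weights $(\gamma, \lambda)$ enter only the query side:

import numpy as np

def augment_sample(x, c1):
    """x -> (x; -||x||^2/2; nu; 1; 0), so that ||x~|| = c1 (Eq. (8))."""
    sq = x @ x
    nu = np.sqrt(c1**2 - sq - sq**2 / 4 - 1.0)
    return np.concatenate([x, [-sq / 2.0, nu, 1.0, 0.0]])

def augment_query(q, gamma, lam, c2):
    """The weights enter only the query augmentation, so the dissimilarity
    can be changed at query time without re-hashing the samples."""
    qbar = (gamma + lam) * q
    mu = -0.5 * gamma * (q @ q)
    qhat = np.concatenate([qbar, [gamma, 0.0, mu]])
    return np.concatenate([qhat, [np.sqrt(c2**2 - qhat @ qhat)]])

rng = np.random.default_rng(2)
x = rng.standard_normal(16); x /= 2 * np.linalg.norm(x)  # ||x|| <= 1
q = rng.standard_normal(16); q /= np.linalg.norm(q)      # ||q|| = 1
c1, c2 = np.sqrt(1 + 1/4 + 1), 2.0   # rough upper bounds for this toy case
gamma, lam = 0.5, 0.5                # query-time mix of L2 and IP

qt, xt = augment_query(q, gamma, lam, c2), augment_sample(x, c1)
lmp = gamma * np.sum((q - x)**2) + 2 * lam * (1 - q @ x)
print(qt @ xt, lam - lmp / 2)        # the two numbers agree

Sign-hashing $\widetilde{q}$ and $\widetilde{x}$ as in Theorem 1 then yields codes whose collision probability depends monotonically on the MP dissimilarity.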

[Fig. 1: Theoretical values of $\rho = \frac{\log p_1}{\log p_2}$ (lower is better), which indicates the LSH performance (see Proposition 1), for (a) L2-NNS ($\gamma = 1, \lambda = 0$), (b) MIPS ($\gamma = 0, \lambda = 1$), and (c) the mixed case ($\gamma = 0.5, \lambda = 0.5$). The horizontal axis indicates $c$ for $c$-ANNS.]

Figure 1 depicts the theoretical value of $\rho = \frac{\log p_1}{\log p_2}$ of mp-LSH-VA, computed by using Theorem 1, for different weight settings with $G = 1$. Note that $\rho$ determines the quality of LSH (smaller is better) for $c$-ANNS performance (see Proposition 1). For the L2-NNS and MIPS cases, the $\rho$ values of the standard LSH methods, i.e., L2-LSH (Proposition 2) and simple-LSH (Proposition 4), are also shown for comparison. Although mp-LSH-VA offers attractive flexibility with adjustable dissimilarity, Figure 1 implies its inferior performance compared to the standard methods, especially in the L2-NNS case. The reason might be a too strong asymmetry between the query and the samples: a query and a sample are far apart in the augmented space, even if they are close to each other in the original space. We can see this from the first $G$ augmented entries of $r$ and $y$ in Eq. (10): those entries for the query are non-negative, i.e., $r_m \ge 0$ for $m = 1, \ldots, G$, while the corresponding entries for the sample are non-positive, i.e., $y_m \le 0$ for $m = 1, \ldots, G$. We believe that there is room to improve the performance of mp-LSH-VA, e.g., by adding constants and changing the scales of some augmented entries, which we leave as future work. In the next subsections, we propose alternative approaches, where the codes are as symmetric as possible, and down-weighting is done by changing the metric in the code space. This effectively keeps points that are close in the original space close in the code space.

C. Multiple Purpose LSH with Code Concatenation (mp-LSH-CC)

Let $\overline{\gamma}^{(g)} = \sum_{w=1}^W \gamma_w^{(g)}$, $\overline{\eta}^{(g)} = \sum_{w=1}^W \eta_w^{(g)}$, and $\overline{\lambda}^{(g)} = \sum_{w=1}^W \lambda_w^{(g)}$, and define the metric-wise weighted average queries by $\overline{q}_{L2}^{(g)} = \sum_{w=1}^W \frac{\gamma_w^{(g)}}{\overline{\gamma}^{(g)}} q_w^{(g)}$, $\overline{q}_{\cos}^{(g)} = \sum_{w=1}^W \frac{\eta_w^{(g)}}{\overline{\eta}^{(g)}} \frac{q_w^{(g)}}{\|q_w^{(g)}\|}$, and $\overline{q}_{\mathrm{ip}}^{(g)} = \sum_{w=1}^W \frac{\lambda_w^{(g)}}{\overline{\lambda}^{(g)}} q_w^{(g)}$. Our second proposal, called multiple purpose LSH with code concatenation (mp-LSH-CC), simply concatenates multiple LSH codes, and performs NNS under the following distance metric at query time:

$D_{\mathrm{CC}}(\{q_w\}, x) = \sum_{g=1}^{G} \sum_{t=1}^{T} \Big\{ \overline{\gamma}^{(g)} R \left| h^{L2}_t(\overline{q}_{L2}^{(g)}) - h^{L2}_t(x^{(g)}) \right| + \frac{\pi \overline{\eta}^{(g)}}{2} \mathbb{1}\left( h^{\operatorname{sign}}_t(\overline{q}_{\cos}^{(g)}) \ne h^{\operatorname{sign}}_t(x^{(g)}) \right) + \frac{\pi \overline{\lambda}^{(g)}}{2} \mathbb{1}\left( h^{\mathrm{smp\text{-}q}}_t(\overline{q}_{\mathrm{ip}}^{(g)}) \ne h^{\mathrm{smp\text{-}x}}_t(x^{(g)}) \right) \Big\}$,   (11)

where $h_t$ denotes the $t$-th independent draw of the corresponding LSH code for $t = 1, \ldots, T$.

The distance (11) is a multi-metric, i.e., a linear combination of metrics [8], in the code space. For a multi-metric, we can use the cover tree [11] for efficient exact NNS. Assuming that all adjustable linear weights are upper-bounded by 1, the cover tree expresses the neighboring relations between samples, taking all possible weight settings into account. NNS is conducted by bounding the code metric for a given weight setting. Thus, mp-LSH-CC allows selective exploration of hash buckets, so that we only need to accurately measure the distance to the samples assigned to the hash buckets within a small code distance. The query time complexity of the cover tree is $O(\kappa^{12} \log N)$, where $\kappa$ is a data-dependent expansion constant [12]. Another good aspect of the cover tree is that it allows dynamic insertion and deletion of samples, and therefore lends itself naturally to the streaming setting. Appendix F describes further details.
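A minimal sketch of the concatenated code distance follows; it is ours, with the weighting of Eq. (11) simplified (the $\pi/2$ scale factors are folded into plain disagreement counts) and restricted to $G = 1$ with a single query per metric:

import numpy as np

def l2_code(v, A, b, R=2.0):
    return np.floor((A @ v + b) / R)

def sign_code(v, A):
    return np.where(A @ v >= 0, 1.0, -1.0)

def cc_distance(q, x_codes, A, A1, b, gamma, eta, lam, R=2.0):
    """x_codes: precomputed (L2, sign, simple) codes of a sample.
    Only the query-side codes and the weights change at query time."""
    l2_x, sign_x, smp_x = x_codes
    d  = gamma * R * np.abs(l2_code(q, A, b, R) - l2_x).sum()
    d += eta * (sign_code(q, A) != sign_x).sum()
    q_aug = np.append(q, 0.0)            # simple-LSH query augmentation
    d += lam * (sign_code(q_aug, A1) != smp_x).sum()
    return d

rng = np.random.default_rng(3)
L, T = 32, 64
A  = rng.standard_normal((T, L))
A1 = rng.standard_normal((T, L + 1))
b  = rng.uniform(0.0, 2.0, T)
x = rng.standard_normal(L); x /= 2 * np.linalg.norm(x)
x_codes = (l2_code(x, A, b),
           sign_code(x, A),
           sign_code(np.append(x, np.sqrt(1 - x @ x)), A1))
q = rng.standard_normal(L); q /= np.linalg.norm(q)
print(cc_distance(q, x_codes, A, A1, b, gamma=0.5, eta=0.0, lam=0.5))

Note that all three code blocks must be stored per sample, which is exactly the redundancy criticized in the next paragraph.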
In the pure case of the L2, cosine, or IP dissimilarity, the hash code of mp-LSH-CC is equivalent to the corresponding base LSH code, and therefore the performance is guaranteed by Propositions 2-4, respectively. However, mp-LSH-CC is not optimal in terms of memory consumption and NNS efficiency. This inefficiency comes from the fact that it redundantly stores the same angular (or cosine-distance) information in each of the L2-, sign-, and simple-LSH codes. Note that the information of a vector is dominated by its angular components unless the dimensionality $L$ is very small.

D. Multiple Purpose LSH with Code Augmentation and Transformation (mp-LSH-CAT)

Our third proposal, called multiple purpose LSH with code augmentation and transformation (mp-LSH-CAT), offers significantly lower memory requirements and faster NNS than mp-LSH-CC by sharing the angular information among all considered dissimilarity measures. Let $\overline{q}_{L2+\mathrm{ip}}^{(g)} = \sum_{w=1}^W (\gamma_w^{(g)} + \lambda_w^{(g)}) q_w^{(g)}$ and, in this subsection, $\overline{q}_{\cos}^{(g)} = \sum_{w=1}^W \eta_w^{(g)} q_w^{(g)} / \|q_w^{(g)}\|$.

We essentially use sign-hash functions augmented with norm information of the data, giving the following augmented codes:

$H_q^{\mathrm{CAT}}(\{q_w\}) = \left( H(\overline{q}_{L2+\mathrm{ip}});\ H(\overline{q}_{\cos});\ 0_G \right)$ and   (12)
$H_x^{\mathrm{CAT}}(x) = \left( \overline{H}(x);\ H(x);\ j(x) \right)$, where   (13)

$H(v) = \left( \operatorname{sign}(A_1 v^{(1)}), \ldots, \operatorname{sign}(A_G v^{(G)}) \right)$,   (14)
$\overline{H}(v) = \left( \|v^{(1)}\| \operatorname{sign}(A_1 v^{(1)}), \ldots, \|v^{(G)}\| \operatorname{sign}(A_G v^{(G)}) \right)$,
$j(v) = \left( \|v^{(1)}\|^2;\ \ldots;\ \|v^{(G)}\|^2 \right)$,

for a partitioned vector $v = (v^{(1)}; \ldots; v^{(G)}) \in \mathbb{R}^L$ and $0_G = (0; \ldots; 0) \in \mathbb{R}^G$. Here, each entry of $A = (A_1, \ldots, A_G) \in \mathbb{R}^{T \times L}$ follows $A_{t,l} \sim \mathcal{N}(0, 1)$. For two matrices $H, H' \in \mathbb{R}^{(2T+1) \times G}$ in the transformed hash code space, we measure the distance with the following multi-metric:

$D_{\mathrm{CAT}}(H, H') = \sum_{g=1}^{G} \left( \alpha^{(g)} \sum_{t=1}^{T} \left| H_{t,g} - H'_{t,g} \right| + \beta^{(g)} \sum_{t=T+1}^{2T} \left| H_{t,g} - H'_{t,g} \right| + \frac{\overline{\gamma}^{(g)}}{2} T \left| H_{2T+1,g} - H'_{2T+1,g} \right| \right)$,   (15)

where $\alpha^{(g)} = \|\overline{q}_{L2+\mathrm{ip}}^{(g)}\|$ and $\beta^{(g)} = \|\overline{q}_{\cos}^{(g)}\|$. Although the hash codes consist of $(2T+1)G$ entries, we do not need to store all of them, and computation can be made simpler and faster by first computing the total number of collisions in the sign-LSH part (14) for $g = 1, \ldots, G$:

$C^{(g)}(v, v') = \sum_{t=1}^{T} \mathbb{1}\left\{ H(v)_{t,g} = H(v')_{t,g} \right\}$.   (16)

Note that this computation, which dominates the cost of evaluating code distances, can be performed efficiently with bit operations. With the total number of collisions (16), the metric (15) between a query set $\{q_w\}$ and a sample $x$ can be expressed as

$D_{\mathrm{CAT}}\left( H_q^{\mathrm{CAT}}(\{q_w\}), H_x^{\mathrm{CAT}}(x) \right) = \sum_{g=1}^{G} \Big\{ \alpha^{(g)} \left[ T + \|x^{(g)}\| \left( T - 2 C^{(g)}(\overline{q}_{L2+\mathrm{ip}}, x) \right) \right] + 2\beta^{(g)} \left( T - C^{(g)}(\overline{q}_{\cos}, x) \right) + \frac{\overline{\gamma}^{(g)}}{2} T \|x^{(g)}\|^2 \Big\}$.   (17)

Given a query set, this can be computed from $H(x) \in \mathbb{R}^{T \times G}$ and $\|x^{(g)}\|$ for $g = 1, \ldots, G$. Therefore, we only need to store the $TG$ sign bits, which are required by sign-LSH alone, and $G$ additional float numbers. Similarly to mp-LSH-CC, we use the cover tree for efficient NNS based on the code distance (15). In the cover tree construction, we set the metric weights to their upper bounds, i.e., $\alpha = \beta = \overline{\gamma} = 1$, and measure the distance between samples by

$D_{\mathrm{CAT}}\left( H_x^{\mathrm{CAT}}(x), H_x^{\mathrm{CAT}}(x') \right) = \sum_{g=1}^{G} \Big\{ C^{(g)}(x, x') \left| \|x^{(g)}\| - \|x'^{(g)}\| \right| + \left( T - C^{(g)}(x, x') \right) \left( \|x^{(g)}\| + \|x'^{(g)}\| \right) + 2\left( T - C^{(g)}(x, x') \right) + \frac{T}{2} \left| \|x^{(g)}\|^2 - \|x'^{(g)}\|^2 \right| \Big\}$.   (18)

Since the collision probability can be zero, we cannot directly apply the standard LSH theory with the $\rho$ value guaranteeing the ANNS performance. Instead, we show that the metric (15) of mp-LSH-CAT approximates the MP dissimilarity (7), so that the quality of ANNS is guaranteed.

Theorem 2: For $\eta_w = 0, \forall w$, it is $\lim_{T \to \infty} \frac{D_{\mathrm{CAT}}}{T} = \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const} + \mathrm{error}$, with $|\mathrm{error}| \le 0.2105 \left( \|\lambda\|_1 + \|\gamma\|_1 \right)$. (The proof is given in Appendix C.)

Theorem 3: For $\gamma_w = \lambda_w = 0, \forall w$, it is $\lim_{T \to \infty} \frac{D_{\mathrm{CAT}}}{T} = \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const} + \mathrm{error}$, with $|\mathrm{error}| \le 0.2105 \|\eta\|_1$. (The proof is given in Appendix D.)

Corollary 1: $\lim_{T \to \infty} \frac{D_{\mathrm{CAT}}}{T} = \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const} + \mathrm{error}$, with $|\mathrm{error}| \le 0.421$.

The error, with a maximum of 0.421, ranges one order of magnitude below the MP dissimilarity, which itself has a range of 4. Note that Corollary 1 implies good approximation for the boundary cases, the squared-L2-, IP-, and cosine-distances, through mp-LSH-CAT, since they are special cases of the weights: for example, mp-LSH-CAT approximates the squared-L2-distance when setting $\lambda_w = \eta_w = 0, \forall w$. The following theorem guarantees that ANNS with mp-LSH-CAT succeeds for the pure MIPS case with a specified probability (the proof is given in Appendix E):

Theorem 4: Let $S_0 \in (0, 2)$, $cS_0 \in (S_0 + 2E_1, 2)$, and set $T \ge \frac{48}{(t_2 - t_1)^2} \log\frac{n}{\varepsilon}$, where $t_2 > t_1$ depend on $S_0$ and $c$ (see Appendix E for details). With probability larger than $1 - \varepsilon - (\varepsilon/n)^{3/2}$, mp-LSH-CAT guarantees $c$-ANNS with respect to $\mathcal{L}_{\mathrm{ip}}$ (MIPS).

Note that it is straightforward to show Theorem 4 analogously for the squared-L2- and cosine-distances. In Section IV, we empirically show the good performance of mp-LSH-CAT in general cases.
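The sketch below (our own, for $G = 1$) illustrates the storage layout and the bit-operation trick: per sample, only the packed sign bits of $H(x)$ and the norm $\|x\|$ are kept, and the code distance (17) is evaluated from popcount-style collision counts:

import numpy as np

def pack_signs(v, A):
    """T sign bits of sign(A v), packed into bytes."""
    return np.packbits(A @ v >= 0)

def collisions(bits_a, bits_b, T):
    """C(v, v') of Eq. (16): number of agreeing sign bits."""
    return T - np.unpackbits(bits_a ^ bits_b, count=T).sum()

def cat_distance(q_l2ip, q_cos, x_bits, x_norm, gamma_bar, A, T):
    """Code distance (17) between a query pair and a stored sample."""
    alpha, beta = np.linalg.norm(q_l2ip), np.linalg.norm(q_cos)
    c1 = collisions(pack_signs(q_l2ip, A), x_bits, T) if alpha > 0 else 0
    c2 = collisions(pack_signs(q_cos, A), x_bits, T) if beta > 0 else 0
    return (alpha * (T + x_norm * (T - 2 * c1))
            + beta * 2 * (T - c2)
            + 0.5 * gamma_bar * T * x_norm ** 2)

rng = np.random.default_rng(5)
L, T = 64, 256
A = rng.standard_normal((T, L))
x = rng.standard_normal(L); x /= 2 * np.linalg.norm(x)
x_bits, x_norm = pack_signs(x, A), np.linalg.norm(x)  # stored per sample
q = rng.standard_normal(L); q /= np.linalg.norm(q)
# Pure MIPS weights (gamma = eta = 0, lambda = 1): q_l2ip = q, q_cos = 0.
print(cat_distance(q, np.zeros(L), x_bits, x_norm, 0.0, A, T))

In a real index the XOR-and-popcount in collisions() runs over machine words, which is what makes evaluating (17) much cheaper than the three separate code comparisons of mp-LSH-CC.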
E. Memory Requirement

For all LSH schemes, one can trade off memory consumption and accuracy by changing the hash bit length $T$. However, the memory consumption of a specific hashing scheme heavily differs from the others, so that comparing performance for a globally shared $T$ is inadequate. In this subsection, we derive individual numbers of hashes for each scheme, given a fixed memory budget. We count the theoretically minimal number of bits required to store the hash code of one data point. The two fundamental components we are confronted with are sign-hashes and discretized reals. Sign-hashes can be represented by exactly one bit. For the reals, we choose a resolution such that their discretizations take values in a set of fixed size. The L2-hash function $h^{L2}_{a,b}(x) = \lfloor R^{-1}(a^\top x + b) \rfloor$ is a random variable with potentially infinitely many discrete values. Nevertheless, we can give a realistic upper bound on the number of values the L2-hash essentially takes. Note that $R^{-1} a^\top x$ follows a $\mathcal{N}(\mu = 0, \sigma = R^{-1}\|x\|)$ distribution with $\|x\| \le 1$. Then $P(|R^{-1} a^\top x| > 4\sigma) < 10^{-4}$. Therefore, the L2-hash essentially takes one of $8/R$ discrete values, stored by $3 + \log_2 R^{-1}$ bits. Namely, for $R = 2^{-10}$, an L2-hash requires 13 bits. We also store the norm part of mp-LSH-CAT using 13 bits. Denote by $\mathrm{stor}_{\mathrm{CAT}}(T)$ the required storage of mp-LSH-CAT. Then $\mathrm{stor}_{\mathrm{CAT}}(T) = T_{\mathrm{CAT}} + 13$, which we set as our fixed memory budget for a given $T_{\mathrm{CAT}}$. The baselines sign- and simple-LSH, as well as mp-LSH-VA, are pure sign-hashes, thus giving them a budget of $T^{\operatorname{sign}} = T^{\mathrm{smp}} = T^{\mathrm{VA}} = \mathrm{stor}_{\mathrm{CAT}}(T)$ hashes. As discussed above, L2-LSH may take $T^{L2} = \mathrm{stor}_{\mathrm{CAT}}(T)/13$ hashes. For mp-LSH-CC, we allocate a third of the budget to each of the three components, giving $T_{\mathrm{CC}} = (T_{\mathrm{CC}}^{L2}, T_{\mathrm{CC}}^{\operatorname{sign}}, T_{\mathrm{CC}}^{\mathrm{smp}}) = \mathrm{stor}_{\mathrm{CAT}}(T) \cdot \left( \frac{1}{39}, \frac{1}{3}, \frac{1}{3} \right)$. This consideration is used when we compare mp-LSH-CC and mp-LSH-CAT in Section IV-B.
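The allocation above reduces to simple integer arithmetic; a small helper (ours, for $G = 1$, rounding down) makes the bookkeeping explicit:

def budgets(T_cat):
    """Bit-budget allocation of Section III-E: sign hashes cost 1 bit,
    L2 hashes 13 bits; stor_CAT(T) = T sign bits + one 13-bit norm."""
    budget = T_cat + 13
    return {
        "sign/simple/mp-LSH-VA": budget,           # pure sign hashes
        "L2-LSH": budget // 13,                    # 13 bits per L2 hash
        "mp-LSH-CC (L2, sign, simple)": (budget // 39,
                                         budget // 3,
                                         budget // 3),
    }

print(budgets(1024))
# {'sign/simple/mp-LSH-VA': 1037, 'L2-LSH': 79,
#  'mp-LSH-CC (L2, sign, simple)': (26, 345, 345)}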

[Fig. 2: Precision-recall curves (higher is better) on MovieLens10M data for K = 5 and T = 256, for (a) L2-NNS (γ = 1, λ = 0), (b) MIPS (γ = 0, λ = 1), and (c) the mixed case (γ = 0.5, λ = 0.5).]

[Fig. 3: Precision-recall curves on Netflix data for K = 10 and T = 512, for the same three weight settings.]

IV. EXPERIMENT

Here, we conduct an empirical evaluation on several real-world data sets.

A. Collaborative Filtering

We first evaluate our methods on collaborative filtering data, the MovieLens10M and the Netflix datasets [13]. Following the experiment in [3], [5], we applied PureSVD [14] to get $L$-dimensional user and item vectors, where $L = 150$ for MovieLens and $L = 300$ for Netflix. We centered the samples so that $\sum_{x \in \mathcal{X}} x = 0$, which affects neither the L2-NNS nor the MIPS solution. Regarding the $L$-dimensional vector as a single feature group ($G = 1$), we evaluated the performance in L2-NNS ($W = 1$, $\gamma = 1$, $\eta = \lambda = 0$), MIPS ($W = 1$, $\gamma = \eta = 0$, $\lambda = 1$), and their weighted sum ($W = 2$, $\gamma_1 = 0.5$, $\lambda_2 = 0.5$, $\gamma_2 = \lambda_1 = \eta_1 = \eta_2 = 0$). The queries for L2-NNS were chosen randomly from the items, while the queries for MIPS were chosen from the users. For each query, we found its $K = 1, 5, 10$ nearest neighbors in terms of the MP dissimilarity (7) by linear search, and used them as the ground truth. We set the hash bit length to $T = 128, 256, 512$, and rank the samples (items) based on the Hamming distance for the baseline methods and mp-LSH-VA. For mp-LSH-CC and mp-LSH-CAT, we rank the samples based on their code distances (11) and (15), respectively. After that, we drew the precision-recall curves, defined as

Precision = (relevant seen)/k and Recall = (relevant seen)/K

for different $k$, where "relevant seen" is the number of the true $K$ nearest neighbors that are ranked within the top $k$ positions by the LSH methods. Figures 2 and 3 show the results on MovieLens10M for $K = 5$ and $T = 256$ and on Netflix for $K = 10$ and $T = 512$, respectively, where each curve was averaged over 2000 randomly chosen queries. We observe that mp-LSH-VA performs very poorly in L2-NNS (as badly as simple-LSH, which is not designed for the L2-distance), although it performs reasonably in MIPS. On the other hand, mp-LSH-CC and mp-LSH-CAT perform well in all cases. A similar tendency was observed for other values of $K$ and $T$. Since the poor performance of mp-LSH-VA was shown in theory (Figure 1) and experiment (Figures 2 and 3), we focus on mp-LSH-CC and mp-LSH-CAT in the subsequent subsections.
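For reference, this evaluation protocol is a few lines of code; the sketch below (ours, with toy stand-in data) computes the precision-recall points exactly as defined above:

import numpy as np

def pr_curve(ranked_ids, true_knn, ks):
    """ranked_ids: sample indices sorted by LSH code distance;
    true_knn: the K ground-truth neighbor indices (by Eq. (7))."""
    true = set(true_knn)
    pts = []
    for k in ks:
        seen = len(true.intersection(ranked_ids[:k]))
        pts.append((seen / k, seen / len(true)))  # (precision, recall)
    return pts

rng = np.random.default_rng(4)
ranked = rng.permutation(1000)           # stand-in for an LSH ranking
truth = ranked[rng.integers(0, 50, 5)]   # toy ground truth near the top
print(pr_curve(ranked, truth, ks=[10, 50, 100]))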

[Table I: ANNS results for mp-LSH-CC with $T_{\mathrm{CC}} = (T_{\mathrm{CC}}^{L2}, T_{\mathrm{CC}}^{\operatorname{sign}}, T_{\mathrm{CC}}^{\mathrm{smp}}) = (1024, 1024, 1024)$: recall@k, query time (msec), cover tree construction time (sec), and storage per sample (bytes), for the L2, MIPS, and L2+MIPS settings.]

[Table II: ANNS results for mp-LSH-CAT with $T_{\mathrm{CAT}} = 1024$: the same quantities as in Table I.]

[Table III: ANNS results for mp-LSH-CC with $T_{\mathrm{CC}} = (T_{\mathrm{CC}}^{L2}, T_{\mathrm{CC}}^{\operatorname{sign}}, T_{\mathrm{CC}}^{\mathrm{smp}}) = (27, 346, 346)$: the same quantities as in Table I.]

B. Computation Time in Query Search

Next, we evaluate the query search time and memory consumption of mp-LSH-CC and mp-LSH-CAT on the texmex dataset [15], which was generated from millions of images by applying the standard SIFT descriptor [16] with $L = 128$. Similarly to Section IV-A, we conducted experiments on L2-NNS, MIPS, and their weighted sum, with the same settings for the weights $\gamma$, $\eta$, $\lambda$. We constructed the cover tree with $N = 10^7$ samples, randomly chosen from the ANN_SIFT1B dataset. The queries were chosen from the defined query set, and each query for MIPS was normalized so that $\|q\| = 1$. We ran the performance experiment on a machine with 48 cores (4 AMD Opteron™ 6238 processors) and 512 GB main memory on Ubuntu 12.04.5 LTS. Tables I-III summarize recall@k, query time, cover tree construction time, and required memory storage. Here, recall@k is the recall for $K = 1$ and given $k$. All reported values, except the cover tree construction time, are averaged over 100 queries.

We see that mp-LSH-CC (Table I) and mp-LSH-CAT (Table II) for $T = 1024$ perform comparably well in terms of accuracy (see the columns for recall@k), but mp-LSH-CAT is much faster (see query time) and requires significantly less memory (see storage per sample). Table III shows the performance of mp-LSH-CC with memory requirement equal to that of mp-LSH-CAT for $T = 1024$. More specifically, we use a different bit length for each dissimilarity measure, and set them to $T_{\mathrm{CC}} = (T_{\mathrm{CC}}^{L2}, T_{\mathrm{CC}}^{\operatorname{sign}}, T_{\mathrm{CC}}^{\mathrm{smp}}) = (27, 346, 346)$, with which the memory budget is shared equally between the dissimilarity measures, according to Section III-E. Comparing Table II and Table III, we see that mp-LSH-CC for $T_{\mathrm{CC}} = (27, 346, 346)$, which uses similar memory storage per sample, gives significantly worse recall@k than mp-LSH-CAT for $T = 1024$. Thus, we conclude that both mp-LSH-CC and mp-LSH-CAT perform well, but we recommend the latter when the memory budget is limited, or in applications where the query search time is crucial.

C. Demonstration of Image Retrieval with Mixed Queries

Finally, we demonstrate the usefulness of our flexible mp-LSH in an image retrieval task on the ILSVRC2012 data set [17]. We computed a feature vector for each image by concatenating the 4096-dimensional fc7 activations of the trained VGG16 model [18] with 120-dimensional color features.⁵ Since user preference vectors are not available, we use classifier vectors, i.e., the weights associated with the respective ImageNet classes, as MIPS queries (the entries corresponding to the color features are set to zero). This simulates users who like a particular class of images. We performed ANNS based on the MP dissimilarity by using our mp-LSH-CAT with $T = 512$ in the sample pool consisting of all $N \approx 1.2$M images. In Figure 4(a), each of the three rows consists of the query at the left end and the corresponding top-ranked images. In the first row, the shown black dog image was used as the L2 query $q_1$, and similar black dog images were retrieved according to the L2 dissimilarity ($\gamma_1 = 1.0$ and $\lambda_2 = 0.0$).
In the second row, the VGG16 classifier vector for trench coats was used as the MIPS query $q_2$, and images containing trench coats were retrieved according to the MIPS dissimilarity ($\gamma_1 = 0.0$ and $\lambda_2 = 1.0$). In the third row, images containing black trench coats were retrieved according to the mixed dissimilarity with $\gamma_1 = 0.6$ and $\lambda_2 = 0.4$. Figure 4(b) shows another example with a strawberry L2 query and the ice creams MIPS query.

⁵ We computed histograms on the central crop of an image, covering 50% of the area, for each RGB color channel with 8 and 32 bins. We normalized the histograms and concatenated them.

[Fig. 4: Image retrieval results with mixed queries for (a) trench coats and (b) ice creams. In both (a) and (b), the top row shows the L2 query (left end) and the images retrieved by ANNS with mp-LSH-CAT for T = 512 according to the L2 dissimilarity (γ₁ = 1.0, λ₂ = 0.0); the second row shows the MIPS query and the images retrieved according to the IP dissimilarity (γ₁ = 0.0, λ₂ = 1.0); the third row shows the images retrieved according to the mixed dissimilarity with γ₁ = 0.6 and λ₂ = 0.4.]

We see that, in both examples, mp-LSH-CAT handles the combined query well: it brings images that are close to the L2 query and relevant to the MIPS query. Other examples can be found through our online demo.

V. CONCLUSION

When querying huge amounts of data, it becomes mandatory to increase efficiency, i.e., even linear methods may be too computationally involved. Hashing, in particular locality sensitive hashing (LSH), has become a highly efficient workhorse that can yield answers to queries in sublinear time, for tasks such as L2-/cosine-distance nearest neighbor search (NNS) or maximum inner product search (MIPS). While for typical applications the type of query has to be fixed beforehand, it is not uncommon to query with respect to several aspects of the data, perhaps even reweighting them dynamically at query time. Our paper contributes exactly here, namely by proposing three multiple purpose locality sensitive hashing (mp-LSH) methods which enable L2-/cosine-distance NNS, MIPS, and their weighted sums.⁷ A user can now indeed, and efficiently, change the importance of the weights at query time without recomputing the hash functions. Our paper has placed its focus on proving the feasibility and efficiency of the mp-LSH methods, and on introducing the very interesting cover tree concept, which is less commonly applied in the machine learning world, for fast querying over the defined multi-metric space. Finally, we provide a demonstration of the usefulness of our novel technique. Future studies will extend the possibilities of mp-LSH, e.g., by further including other types of dissimilarity measure, such as the distance from a hyperplane [22], and further applications with combined queries, e.g., retrieval with one complex multiple purpose query, say, a Pareto front for subsequent decision making. In addition, we would like to analyze the interpretability of the nonlinear query mechanism in terms of the salient features that have led to the query result.

⁷ Although a lot of hashing schemes for multi-modal data have been proposed [19], [20], [21], most of them are data-dependent, and do not offer adjustability of the importance weights at query time.

APPENDIX A
DERIVATION OF THE INNER PRODUCT IN THE PROOF OF THEOREM 1

The inner product between the augmented vectors $\widetilde{q}$ and $\widetilde{x}$, defined in Eq. (10), is given by

$\widetilde{q}^\top \widetilde{x} = \sum_{w=1}^W \sum_{g=1}^G (\gamma_w^{(g)} + \lambda_w^{(g)}) q_w^{(g)\top} x^{(g)} - \frac{1}{2} \sum_{w=1}^W \sum_{g=1}^G \gamma_w^{(g)} \left( \|q_w^{(g)}\|^2 + \|x^{(g)}\|^2 \right)$
$= \sum_{w,g} \lambda_w^{(g)} q_w^{(g)\top} x^{(g)} - \frac{1}{2} \sum_{w,g} \gamma_w^{(g)} \left\| q_w^{(g)} - x^{(g)} \right\|^2$
$= \|\lambda\|_1 - \frac{1}{2} \sum_{w,g} \left\{ \gamma_w^{(g)} \left\| q_w^{(g)} - x^{(g)} \right\|^2 + 2\lambda_w^{(g)} \left( 1 - q_w^{(g)\top} x^{(g)} \right) \right\}$
$= \|\lambda\|_1 - \frac{1}{2} \mathcal{L}_{\mathrm{mp}}(\{q_w\}, x)$.

APPENDIX B
LEMMA 1: INNER PRODUCT APPROXIMATION

For $q, x \in \mathbb{R}^L$ (with $G = 1$), let

$d_T(q, x) = \frac{1}{T} \sum_{t=1}^{T} \left| H(q)_{t,1} - \overline{H}(x)_{t,1} \right|$

with expectation

$d(q, x) = \mathbb{E}[d_T(q, x)] = \mathbb{E}\left| H(q)_{1,1} - \overline{H}(x)_{1,1} \right|$,

and define

$\mathcal{L}(q, x) = 1 - \frac{q^\top x}{\|q\|}$.

Lemma 1: The following statements hold:
(a) $d(q, x) = 1 - \|x\| \left( 1 - \frac{2}{\pi} \angle(q, x) \right)$.
(b) For $E_x = 0.2105 \|x\|$, it is $|\mathcal{L}(q, x) - d(q, x)| \le E_x$.   (19)
(c) Let $b(q, x) = 1 + \frac{2}{\pi} \left( \mathcal{L}(q, x) - 1 \right)$. Then for $\mathcal{L}(q, x) \le 1$ it is $\mathcal{L}(q, x) \le d(q, x) \le b(q, x)$, and for $\mathcal{L}(q, x) \ge 1$ it is $b(q, x) \le d(q, x) \le \mathcal{L}(q, x)$.
(d) It holds that $|\mathcal{L}(q, x) - d(q, x)| \le \min\left\{ \left(1 - \frac{2}{\pi}\right) |\mathcal{L}(q, x) - 1|,\ E_x \right\}$, and for $s_x = 0.58 \|x\|$, if $|\mathcal{L}(q, x) - 1| \le s_x$, the first bound is tighter than $E_x$.

Proof (a): Defining $p_{\mathrm{col}} = 1 - \frac{1}{\pi} \angle(q, x)$, we have

$\mathbb{E}\left| H(q)_{1,1} - \overline{H}(x)_{1,1} \right| = (1 - \|x\|) p_{\mathrm{col}} + (1 + \|x\|)(1 - p_{\mathrm{col}}) = 1 + \|x\|(1 - 2p_{\mathrm{col}}) = 1 - \|x\|\left(1 - \frac{2}{\pi}\angle(q, x)\right)$.

Proof (b): With $\theta = \angle(q, x)$,

$\mathcal{L}(q, x) - d(q, x) = \|x\| \left( 1 - \cos\theta - \frac{2}{\pi}\theta \right)$.

Maximizing the absolute value over $\theta \in [0, \pi]$ (the extrema lie at $\sin\theta = 2/\pi$), we obtain the maximum $E_x = 0.2105\|x\|$.

Proof (c): It is $d(q, x) - b(q, x) = \frac{2\|x\|}{\pi}\left( \theta + \cos\theta - \frac{\pi}{2} \right)$, which is non-positive for $\theta \le \pi/2$ and non-negative for $\theta \ge \pi/2$, while $\mathcal{L}(q, x) - d(q, x) = \|x\|(1 - \cos\theta - \frac{2}{\pi}\theta)$ is non-positive for $\theta \le \pi/2$ and non-negative for $\theta \ge \pi/2$. Noting that $\mathcal{L}(q, x) \le 1$ if and only if $\theta \le \pi/2$ proves the claim.

Proof (d): The inequality follows from (b) and (c). Letting $s_x = \frac{E_x}{1 - 2/\pi} \approx 0.58\|x\|$, the first bound is tighter than $E_x$ if $|\mathcal{L}(q, x) - 1| \le s_x$.

Note that $d_T(q, x) \to d(q, x)$ as $T \to \infty$. Therefore, all statements are also valid replacing $d(q, x)$ by $d_T(q, x)$ with $T$ large enough.

APPENDIX C
PROOF OF THEOREM 2

For $\eta_w = 0, \forall w$, we have

$\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) = \sum_{w=1}^W \sum_{g=1}^G \left\{ \gamma_w^{(g)} \|q_w^{(g)} - x^{(g)}\|^2 + 2\lambda_w^{(g)} (1 - q_w^{(g)\top} x^{(g)}) \right\}$.

Recall that $\overline{q}_{L2+\mathrm{ip}}^{(g)} = \sum_{w=1}^W (\gamma_w^{(g)} + \lambda_w^{(g)}) q_w^{(g)}$. By Eq. (17),

$\frac{1}{T} D_{\mathrm{CAT}} = \sum_{g=1}^G \left\{ \|\overline{q}_{L2+\mathrm{ip}}^{(g)}\| \left[ 1 + \|x^{(g)}\| \left( 1 - \frac{2 C^{(g)}(\overline{q}_{L2+\mathrm{ip}}, x)}{T} \right) \right] + \frac{\overline{\gamma}^{(g)}}{2} \|x^{(g)}\|^2 \right\}$,

where the bracketed term equals $d_T(\overline{q}_{L2+\mathrm{ip}}^{(g)}, x^{(g)})$. Using (19), i.e., $|\mathcal{L}(\overline{q}^{(g)}, x^{(g)}) - d(\overline{q}^{(g)}, x^{(g)})| \le E_1 = 0.2105$, we obtain

$\lim_{T \to \infty} \frac{1}{T} D_{\mathrm{CAT}} = \sum_{g=1}^G \left\{ \|\overline{q}_{L2+\mathrm{ip}}^{(g)}\| - \overline{q}_{L2+\mathrm{ip}}^{(g)\top} x^{(g)} + \frac{\overline{\gamma}^{(g)}}{2} \|x^{(g)}\|^2 \right\} + \mathrm{error} = \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const} + \mathrm{error}$,

where $\mathrm{const} = \sum_g \|\overline{q}_{L2+\mathrm{ip}}^{(g)}\| - \|\lambda\|_1 - \frac{1}{2}\sum_{w,g} \gamma_w^{(g)} \|q_w^{(g)}\|^2$ depends only on the queries, and

$|\mathrm{error}| \le E_1 \sum_{g=1}^G \|\overline{q}_{L2+\mathrm{ip}}^{(g)}\| \le E_1 \sum_{w,g} (\gamma_w^{(g)} + \lambda_w^{(g)}) \|q_w^{(g)}\| \le 0.2105 \left( \|\lambda\|_1 + \|\gamma\|_1 \right)$. ∎

APPENDIX D
PROOF OF THEOREM 3

For $\gamma_w = \lambda_w = 0, \forall w$, we have

$\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) = \sum_{w,g} \eta_w^{(g)} \left\| \frac{q_w^{(g)}}{\|q_w^{(g)}\|} - \frac{x^{(g)}}{\|x^{(g)}\|} \right\|^2 = 2 \sum_{w,g} \eta_w^{(g)} \left( 1 - \frac{q_w^{(g)\top} x^{(g)}}{\|q_w^{(g)}\| \|x^{(g)}\|} \right)$.

Recall that $\overline{q}_{\cos}^{(g)} = \sum_w \eta_w^{(g)} q_w^{(g)} / \|q_w^{(g)}\|$. By Eq. (17),

$\frac{1}{T} D_{\mathrm{CAT}} = \sum_g \|\overline{q}_{\cos}^{(g)}\| \cdot \frac{2 \left( T - C^{(g)}(\overline{q}_{\cos}, x) \right)}{T} \longrightarrow \sum_g \|\overline{q}_{\cos}^{(g)}\| \cdot \frac{2}{\pi} \angle(\overline{q}_{\cos}^{(g)}, x^{(g)})$.

Applying (19) to the normalized vectors (i.e., with norm 1), $|(1 - \cos\theta) - \frac{2}{\pi}\theta| \le 0.2105$, so

$\lim_{T \to \infty} \frac{1}{T} D_{\mathrm{CAT}} = \sum_g \|\overline{q}_{\cos}^{(g)}\| \left( 1 - \frac{\overline{q}_{\cos}^{(g)\top} x^{(g)}}{\|\overline{q}_{\cos}^{(g)}\| \|x^{(g)}\|} \right) + \mathrm{error} = \frac{1}{2}\mathcal{L}_{\mathrm{mp}}(\{q_w\}, x) + \mathrm{const} + \mathrm{error}$,

where $\mathrm{const} = \sum_g \left( \|\overline{q}_{\cos}^{(g)}\| - \overline{\eta}^{(g)} \right)$ and $|\mathrm{error}| \le 0.2105 \sum_g \|\overline{q}_{\cos}^{(g)}\| \le 0.2105 \|\eta\|_1$. ∎
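The constant $E_x$ of Lemma 1(b), which drives the error bounds of Theorems 2 and 3 above, can be checked numerically; the following snippet (ours) maximizes $|1 - \cos\theta - \frac{2}{\pi}\theta|$ on $[0, \pi]$:

import numpy as np

theta = np.linspace(0.0, np.pi, 1_000_001)
g = 1.0 - np.cos(theta) - (2.0 / np.pi) * theta
print(np.abs(g).max())            # ~0.21051
print(theta[np.abs(g).argmax()])  # ~2.4515 = pi - arcsin(2/pi)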

APPENDIX E
PROOF OF THEOREM 4

Without loss of generality, we prove the theorem for the plain MIPS case with $G = 1$, $W = 1$, and $\lambda = 1$. Then $\alpha = 1$ and the measure simplifies to $D_{\mathrm{CAT}}(H_q^{\mathrm{CAT}}(q_{\mathrm{ip}}), H_x^{\mathrm{CAT}}(x)) = T\, d_T(q_{\mathrm{ip}}, x)$. Below, $E_1 = 0.2105$ and $s_1 = 0.58$ denote the constants $E_x$ and $s_x$ of Lemma 1 with $\|x\| = 1$.

For $C_1(q_{\mathrm{ip}}, x)$ with $\mu = \mathbb{E}[C_1(q_{\mathrm{ip}}, x)] = T \left( 1 - \frac{1}{\pi} \angle(x, q_{\mathrm{ip}}) \right)$ and $0 < \delta_1 < 1$, $\delta_2 > 0$, we use the following Chernoff bounds:

$P\left( C_1(q_{\mathrm{ip}}, x) \le (1 - \delta_1)\mu \right) \le \exp\left\{ -\frac{\mu}{2} \delta_1^2 \right\}$,   (20)
$P\left( C_1(q_{\mathrm{ip}}, x) \ge (1 + \delta_2)\mu \right) \le \exp\left\{ -\frac{\mu}{3} \min\{\delta_2, \delta_2^2\} \right\}$.   (21)

The approximate nearest-neighbor problem with $r > 0$ and $c > 1$ is defined as follows: if there exists an $x^*$ with $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x^*) \le r$, then we return an $x$ with $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) < cr$. For $cr > r + 2E_1$, we can set $T$ logarithmically dependent on the dataset size to solve the approximate nearest-neighbor problem for $\mathcal{L}_{\mathrm{ip}}$, using $d_T$, with constant success probability. For this we require a viable threshold $t$ that fulfills

$\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) > cr \Rightarrow d(q_{\mathrm{ip}}, x) > t$ and $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) \le r \Rightarrow d(q_{\mathrm{ip}}, x) \le t$.

Namely, we set $t = \frac{t_1 + t_2}{2}$, where

$t_1 = \begin{cases} r + E_1 & \text{if } r \le 1 - s_1, \\ 1 - \frac{2}{\pi}(1 - r) & \text{if } r \in (1 - s_1, 1], \\ r & \text{if } r > 1, \end{cases}$ and $t_2 = \begin{cases} cr & \text{if } cr \le 1, \\ 1 + \frac{2}{\pi}(cr - 1) & \text{if } cr \in (1, 1 + s_1], \\ cr - E_1 & \text{if } cr > 1 + s_1. \end{cases}$

In any case it is $t_2 > t_1$. First note that $t_1$ and $t_2$ are strictly monotone increasing in $r$ and $cr$, respectively. It therefore suffices to show $t_2 \ge t_1$ for the lower bound of $t_2$ obtained at $cr = r + 2E_1$:

Case $r \le 1 - s_1$: it is $t_1 = r + E_1$ and $t_2 = cr$, where $t_1 = r + E_1 < r + 2E_1 = cr = t_2$.

Case $r \in (1 - s_1, 1 - 2E_1]$: it is $t_1 = 1 - \frac{2}{\pi}(1 - r)$ and $t_2 = cr$, such that $t_1 \le r + 2E_1 = t_2$, since $1 - r < s_1 = E_1 / (1 - \frac{2}{\pi})$.

Case $r \in (1 - 2E_1, 1]$: it is $t_1 = 1 - \frac{2}{\pi}(1 - r)$ and $t_2 = 1 + \frac{2}{\pi}(cr - 1)$ with $cr > 1$, such that $t_2 - t_1 = \frac{2}{\pi}(cr - 1) + \frac{2}{\pi}(1 - r) = \frac{4}{\pi} E_1 > 0$.

Case $r \in (1, 1 + s_1 - 2E_1]$: it is $t_1 = r$ and $t_2 = 1 + \frac{2}{\pi}(cr - 1)$, such that $t_2 - t_1 = \frac{4}{\pi} E_1 - \left(1 - \frac{2}{\pi}\right)(r - 1) \ge \frac{4}{\pi} E_1 - \left(1 - \frac{2}{\pi}\right)(s_1 - 2E_1) = E_1 > 0$.

Case $r > 1 + s_1 - 2E_1$: it is $t_1 = r$ and $t_2 = cr - E_1 = r + E_1 > r = t_1$.

Now, using $d_T(q_{\mathrm{ip}}, x) = 1 + \|x\| - \frac{2\|x\|}{T} C_1(q_{\mathrm{ip}}, x)$ and $\mu = T \frac{1 + \|x\| - d(q_{\mathrm{ip}}, x)}{2\|x\|}$, define

$\delta = \frac{t - d(q_{\mathrm{ip}}, x)}{1 + \|x\| - d(q_{\mathrm{ip}}, x)} = \frac{T \left( t - d(q_{\mathrm{ip}}, x) \right)}{2 \|x\| \mu}$.

For $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) \le r$, we can lower-bound the probability of $d_T(q_{\mathrm{ip}}, x)$ not exceeding the specified threshold:

$P(d_T(q_{\mathrm{ip}}, x) \le t) = P\left( C_1(q_{\mathrm{ip}}, x) \ge (1 - \delta)\mu \right) = 1 - P\left( C_1(q_{\mathrm{ip}}, x) < (1 - \delta)\mu \right) \overset{(20)}{\ge} 1 - \exp\left\{ -\frac{\mu}{2}\delta^2 \right\}$.

We can show $d(q_{\mathrm{ip}}, x) \le t_1$ using Lemma 1(c) and (d):

Case $r \le 1 - s_1$: $d(q_{\mathrm{ip}}, x) \le \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) + E_1 \le r + E_1 = t_1$.

Case $r \in (1 - s_1, 1]$: $d(q_{\mathrm{ip}}, x) \le b(q_{\mathrm{ip}}, x) = 1 + \frac{2}{\pi}(\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) - 1) \le 1 - \frac{2}{\pi}(1 - r) = t_1$.

Case $r > 1$: for $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) \le 1$ it is $d(q_{\mathrm{ip}}, x) \le 1$; else $d(q_{\mathrm{ip}}, x) \le \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x)$; hence $d(q_{\mathrm{ip}}, x) \le \max\{1, \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x)\} \le r = t_1$.

Thus we can bound

$\delta \ge \frac{T(t - t_1)}{2\|x\|\mu} = \frac{T(t_2 - t_1)}{4\|x\|\mu}$ and $\frac{\mu}{2}\delta^2 \ge \frac{T^2 (t_2 - t_1)^2}{32 \|x\|^2 \mu} \ge \frac{T (t_2 - t_1)^2}{32}$,

using $\mu \le T$ and $\|x\| \le 1$, such that

$P(d_T(q_{\mathrm{ip}}, x) \le t) \ge 1 - \exp\left\{ -\frac{(t_2 - t_1)^2}{32} T \right\}$.

For $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) > cr$, we can upper-bound the probability of $d_T(q_{\mathrm{ip}}, x)$ dropping below the specified threshold:

$P(d_T(q_{\mathrm{ip}}, x) \le t) = P\left( C_1(q_{\mathrm{ip}}, x) \ge (1 + \delta')\mu \right) \overset{(21)}{\le} \exp\left\{ -\frac{\mu}{3} \min\{\delta', \delta'^2\} \right\}$,

where $\delta' = \frac{T(d(q_{\mathrm{ip}}, x) - t)}{2\|x\|\mu}$. We can show $d(q_{\mathrm{ip}}, x) \ge t_2$ using Lemma 1(c) and (d):

Case $cr \le 1$: for $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) \ge 1$ it is $d(q_{\mathrm{ip}}, x) \ge 1$; else $d(q_{\mathrm{ip}}, x) \ge \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x)$; hence $d(q_{\mathrm{ip}}, x) \ge \min\{1, \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x)\} \ge cr = t_2$.

Case $cr \in (1, 1 + s_1]$: $d(q_{\mathrm{ip}}, x) \ge b(q_{\mathrm{ip}}, x) = 1 + \frac{2}{\pi}(\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) - 1) \ge 1 + \frac{2}{\pi}(cr - 1) = t_2$.

Case $cr > 1 + s_1$: $d(q_{\mathrm{ip}}, x) \ge \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) - E_1 \ge cr - E_1 = t_2$.

Thus, with $\delta' \ge \frac{T(t_2 - t)}{2\|x\|\mu} = \frac{T(t_2 - t_1)}{4\|x\|\mu}$,

$P(d_T(q_{\mathrm{ip}}, x) \le t) \le \exp\left\{ -\frac{T}{3} \min\left\{ \frac{t_2 - t_1}{4}, \frac{(t_2 - t_1)^2}{16} \right\} \right\} \le \exp\left\{ -\frac{(t_2 - t_1)^2}{48} T \right\}$.

Now, define the events

$E_1(q_{\mathrm{ip}}, x)$: either $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) > r$ or $d_T(q_{\mathrm{ip}}, x) \le t$;   (22)
$E_2(q_{\mathrm{ip}})$: $\forall x \in \mathcal{X}$: $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) > cr \Rightarrow d_T(q_{\mathrm{ip}}, x) > t$.   (23)

Assume that there exists an $x^*$ with $\mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x^*) \le r$. Then the algorithm is successful if both $E_1(q_{\mathrm{ip}}, x^*)$ and $E_2(q_{\mathrm{ip}})$ hold simultaneously. Let $T \ge \frac{48}{(t_2 - t_1)^2} \log\frac{n}{\varepsilon}$. It is

$P(E_2(q_{\mathrm{ip}})) = 1 - P\left( \exists x \in \mathcal{X} : \mathcal{L}_{\mathrm{ip}}(q_{\mathrm{ip}}, x) > cr,\ d_T(q_{\mathrm{ip}}, x) \le t \right) \ge 1 - n \exp\left\{ -\frac{(t_2 - t_1)^2}{48} T \right\} \ge 1 - \varepsilon$.

Also it holds that

$P(E_1(q_{\mathrm{ip}}, x^*)) \ge 1 - \exp\left\{ -\frac{(t_2 - t_1)^2}{32} T \right\} \ge 1 - (\varepsilon/n)^{3/2}$.

Therefore, the probability of the algorithm performing approximate nearest neighbor search correctly is larger than

$P(E_2(q_{\mathrm{ip}}), E_1(q_{\mathrm{ip}}, x^*)) \ge 1 - P(\neg E_2(q_{\mathrm{ip}})) - P(\neg E_1(q_{\mathrm{ip}}, x^*)) \ge 1 - \varepsilon - (\varepsilon/n)^{3/2}$. ∎

APPENDIX F
DETAILS OF THE COVER TREE

Here, we detail how to selectively explore the hash buckets with the code dissimilarity measure in non-decreasing order. The difficulty is that the dissimilarity $D$ is a linear combination of metrics, where the weights are selected at query time. Such a metric is referred to as a dynamic metric function or a multi-metric [8]. We use a tree data structure, called the cover tree [11], to index the metric space. We begin the description of the cover tree by introducing the expansion constant and the base of the expansion constant.

Expansion constant κ [12]: defined as the smallest value $\kappa \ge \psi$ such that every ball in the dataset $\mathcal{X}$ can be covered by $\kappa$ balls in $\mathcal{X}$ of radius smaller by the factor $1/\psi$. Here, $\psi$ is the base of the expansion constant.

Data structure: Given a set of data points $\mathcal{X}$, the cover tree $\mathcal{T}$ is a leveled tree where each level is associated with an integer label $i$, which decreases as the tree is descended. For ease of explanation, let $B_{\psi^i}(x)$ denote a closed ball centered at point $x$ with radius $\psi^i$, i.e., $B_{\psi^i}(x) = \{p \in \mathcal{X} : D(p, x) \le \psi^i\}$. At every level $i$ of $\mathcal{T}$ except the root, we create a union of possibly overlapping closed balls with radius $\psi^i$ that cover (contain) all the data points $\mathcal{X}$. The centers of this covering set of balls are stored in nodes at level $i$ of $\mathcal{T}$. Let $C_i$ denote the set of nodes at level $i$. The cover tree $\mathcal{T}$ obeys the following three invariants at all levels:

1) Nesting: $C_i \subset C_{i-1}$. Once a point $x \in \mathcal{X}$ is in a node in $C_i$, it also appears in all its successor nodes.
2) Covering: For every $x' \in C_{i-1}$, there exists an $x \in C_i$ such that $x'$ lies inside $B_{\psi^i}(x)$, and exactly one such $x$ is a parent of $x'$.
3) Separation: For all distinct $x_1, x_2 \in C_i$, $x_1$ lies outside $B_{\psi^i}(x_2)$ and $x_2$ lies outside $B_{\psi^i}(x_1)$.

This structure has a space bound of $O(N)$, where $N$ is the number of samples.

Construction: We use the batch construction method [11], where the cover tree $\mathcal{T}$ is built in a top-down fashion. Initially, we pick a data point $x_0$ and an integer $s$ such that the closed ball $B_{\psi^s}(x_0)$ is the tightest fit that covers the entire dataset $\mathcal{X}$. This point $x_0$ is placed in a single node, called the root of the tree $\mathcal{T}$. We denote the root node as $C_i$ with $i = s$. In order to generate the set $C_{i-1}$ of child nodes of $C_i$, we greedily pick a set of points (including $x_0$) from $C_i$ to satisfy the nesting invariant, and generate closed balls of radius $\psi^{i-1}$ centered on them, in such a way that: (a) all center points lie inside $B_{\psi^i}(x_0)$ (covering invariant); (b) no center point lies inside another ball of radius $\psi^{i-1}$ at level $i - 1$ (separation invariant); and (c) the union of these closed balls covers the entire dataset $\mathcal{X}$. The chosen center points form the set of nodes $C_{i-1}$. Child nodes are recursively generated from each node in $C_{i-1}$, until each data point in $\mathcal{X}$ is the center of a closed ball and resides in a leaf node of $\mathcal{T}$. Note that, while we construct our cover tree, we use our distance function $D$ with all weights set to 1.0, which upper-bounds all subsequent distance metrics that depend on the queries. The construction time complexity is $O(\kappa^{12} N \ln N)$.
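The essence of the query procedure described next is a single pruning rule. The following highly simplified sketch (ours; it assumes a tree already built and ignores the batch construction of [11]) shows that rule in isolation: a subtree rooted at level $i$ can be skipped when even its closest possible descendant, at distance $d - \psi^i$, cannot beat the current best candidate:

class Node:
    def __init__(self, point_id, level):
        self.point_id = point_id   # sample ID; codes are stored externally
        self.level = level
        self.children = []

def search(node, dist, best, psi=1.2):
    """dist(point_id) evaluates the query-time multi-metric D;
    best is (distance, id) of the current nearest candidate."""
    d = dist(node.point_id)
    if d < best[0]:
        best = (d, node.point_id)
    # Prune: no descendant of node can be closer than d - psi**level.
    if d - psi ** node.level <= best[0]:
        for child in sorted(node.children, key=lambda c: dist(c.point_id)):
            best = search(child, dist, best, psi)
    return best

# Toy usage: a hand-built two-level tree over scalar "codes".
pts = {0: 5.0, 1: 1.0, 2: 9.0, 3: 1.4}
root = Node(0, level=3)
root.children = [Node(i, level=2) for i in (1, 2, 3)]
dist = lambda i: abs(pts[i] - 1.3)   # stand-in for D(q, .)
print(search(root, dist, best=(float("inf"), None)))  # -> (0.1, 3)

Because the query-time weights only shrink $D$ relative to the construction-time metric, this pruning remains valid for every weight setting, which is the point made in the next paragraphs.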

To achieve a more compact cover tree, we store only element identification numbers (IDs) in the cover tree, and not the original vectors. Furthermore, we store the hash bits using compressed representations (bit-sets), which reduces the storage size compared to a naive implementation down to $T$ bits. For mp-LSH-CAT with $G = 1$, each element in the cover tree contains $T$ bits and two integers.⁸ For example, indexing a 128-dimensional vector naively requires 1032 bytes, but indexing the fully augmented one requires only 24 bytes, yielding a 97.7% memory saving.

⁸ We assume 4 bytes per integer and 8 bytes per double here.

Querying: The nearest neighbor search with a cover tree is performed as follows. The search begins at the root of the cover tree and descends level-wise. On each descent, we build a candidate set $C$, which holds all the child nodes (center points of our closed balls). We then prune away centers (nodes) in $C$ that cannot possibly lead to a nearest neighbor of the query point $q$ if we descended into them. The pruning mechanism is predicated on a proven result in [11], which states that for any point $x \in C_{i-1}$, the distance between $x$ and any descendant $x'$ is upper-bounded by $\psi^i$. Therefore, any center point whose distance from $q$ exceeds $\min_{x \in C} D(q, x) + \psi^i$ cannot possibly have a descendant that can replace the current closest center point to $q$, and hence can safely be pruned. We add an additional check to speed up the search by not always descending to the leaf nodes. The time complexity of querying the cover tree is $O(\kappa^{12} \ln N)$.

Effect of the multi-metric distance while querying: It is important to note that minimizing the overlap between the closed balls at higher levels (i.e., closer to the root) of the cover tree allows us to effectively prune a very large portion of the search space and compute the nearest neighbor faster. Recall that the cover tree is constructed by setting our distance function $D$ with all weights equal to 1.0. During querying, we allow $D$ to be a linear combination of metrics, where the weights lie in the range $[0, 1]$, which means that the distance metric $D$ used during querying always under-estimates the construction-time distances, i.e., reports lower distances. During querying, the cover tree's structure is still intact and all the invariant properties are satisfied. The main difference occurs in the computation of $\min_{x \in C} D(q, x)$, the shortest distance from a center point to the query $q$ using the new distance metric. Interestingly, this new distance gets even smaller, thus reducing our search radius $\min_{x \in C} D(q, x) + \psi^i$ centered at $q$, which in turn implies that at every level we manage to prune more center points, as the overlap between the closed balls is also reduced.

Streaming: The cover tree lends itself naturally to the setting where nearest neighbor computations have to be performed on a stream of data points, because the cover tree allows dynamic insertion and deletion of points. The time complexity for both of these operations is $O(\kappa^6 \ln N)$, which is faster than querying.

Parameter choice: In our implementation for the experiments, we set the base of the expansion constant to $\psi = 1.2$, which we empirically found to work best on the texmex dataset.

Acknowledgments: This work was supported by the German Research Foundation (GRK 1589/1) and by the Federal Ministry of Education and Research (BMBF) under the project Berlin Big Data Center (FKZ 01IS14013A).
REFERENCES

[1] P. Indyk and R. Motwani, "Approximate nearest neighbors: Towards removing the curse of dimensionality," in Proc. of STOC, 1998, pp. 604-613.
[2] J. Wang, H. T. Shen, J. Song, and J. Ji, "Hashing for similarity search: A survey," arXiv:1408.2927v1 [cs.DS], 2014.
[3] A. Shrivastava and P. Li, "Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS)," in NIPS, vol. 27, 2014.
[4] Y. Bachrach, Y. Finkelstein, R. Gilad-Bachrach, L. Katzir, N. Koenigstein, N. Nice, and U. Paquet, "Speeding up the Xbox recommender system using a Euclidean transformation for inner-product spaces," in Proc. of RecSys, 2014.
[5] A. Shrivastava and P. Li, "Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS)," in Proc. of UAI, 2015.
[6] B. Neyshabur and N. Srebro, "On symmetric and asymmetric LSHs for inner product search," in ICML, vol. 37, 2015.
[7] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, "Locality-sensitive hashing scheme based on p-stable distributions," in Proc. of SCG, 2004, pp. 253-262.
[8] B. Bustos, S. Kreft, and T. Skopal, "Adapting metric indexes for searching in multi-metric spaces," Multimedia Tools and Applications, vol. 58, no. 3, 2012.
[9] M. X. Goemans and D. P. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming," Journal of the ACM, vol. 42, no. 6, pp. 1115-1145, 1995.
[10] M. S. Charikar, "Similarity estimation techniques from rounding algorithms," in Proc. of STOC, 2002, pp. 380-388.
[11] A. Beygelzimer, S. Kakade, and J. Langford, "Cover trees for nearest neighbor," in Proc. of ICML, 2006, pp. 97-104.
[12] J. Heinonen, Lectures on Analysis on Metric Spaces, ser. Universitext, 2001.
[13] S. Funk, "Try this at home," http://sifter.org/~simon/journal/20061211.html, 2006.
[14] P. Cremonesi, Y. Koren, and R. Turrin, "Performance of recommender algorithms on top-n recommendation tasks," in Proc. of RecSys, 2010, pp. 39-46.
[15] H. Jégou, R. Tavenard, M. Douze, and L. Amsaleg, "Searching in one billion vectors: re-rank with source coding," in Proc. of ICASSP, 2011, pp. 861-864.
[16] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2, pp. 91-110, 2004.
[17] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet large scale visual recognition challenge," International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211-252, 2015.
[18] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
[19] J. Song, Y. Yang, Z. Huang, H. T. Shen, and J. Luo, "Effective multiple feature hashing for large-scale near-duplicate video retrieval," IEEE Trans. on Multimedia, vol. 15, no. 8, 2013.
[20] S. Moran and V. Lavrenko, "Regularized cross-modal hashing," in Proc. of SIGIR, 2015.
[21] S. Xu, S. Wang, and Y. Zhang, "Summarizing complex events: a cross-modal solution of storylines extraction and reconstruction," in Proc. of EMNLP, 2013.
[22] P. Jain, S. Vijayanarasimhan, and K. Grauman, "Hashing hyperplane queries to near points with applications to large-scale active learning," in Advances in NIPS, 2010.

On Symmetric and Asymmetric LSHs for Inner Product Search

On Symmetric and Asymmetric LSHs for Inner Product Search Behnam Neyshabur Nathan Srebro Toyota Technological Institute at Chicago, Chicago, IL 6637, USA BNEYSHABUR@TTIC.EDU NATI@TTIC.EDU Abstract We consider the problem of designing locality sensitive hashes

More information

PHY 133 Lab 1 - The Pendulum

PHY 133 Lab 1 - The Pendulum 3/20/2017 PHY 133 Lab 1 The Pendulum [Stony Brook Physics Laboratory Manuals] Stony Brook Physics Laboratory Manuals PHY 133 Lab 1 - The Pendulum The purpose of this lab is to measure the period of a simple

More information

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 13 Indexes for Multimedia Data 13 Indexes for Multimedia

More information

Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data

Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data 1/29/2010 13 Indexes for Multimedia Data 13 Indexes for Multimedia Data 13.1 R-Trees Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

More information

LP Rounding and Combinatorial Algorithms for Minimizing Active and Busy Time

LP Rounding and Combinatorial Algorithms for Minimizing Active and Busy Time LP Roundin and Combinatorial Alorithms for Minimizin Active and Busy Time Jessica Chan, Samir Khuller, and Koyel Mukherjee University of Maryland, Collee Park {jschan,samir,koyelm}@cs.umd.edu Abstract.

More information

Strong Interference and Spectrum Warfare

Strong Interference and Spectrum Warfare Stron Interference and Spectrum Warfare Otilia opescu and Christopher Rose WILAB Ruters University 73 Brett Rd., iscataway, J 8854-86 Email: {otilia,crose}@winlab.ruters.edu Dimitrie C. opescu Department

More information

Generalized Distance Metric as a Robust Similarity Measure for Mobile Object Trajectories

Generalized Distance Metric as a Robust Similarity Measure for Mobile Object Trajectories Generalized Distance Metric as a Robust Similarity Measure for Mobile Object rajectories Garima Pathak, Sanjay Madria Department of Computer Science University Of Missouri-Rolla Missouri-6541, USA {madrias}@umr.edu

More information

Matrix multiplication: a group-theoretic approach

Matrix multiplication: a group-theoretic approach CSG399: Gems of Theoretical Computer Science. Lec. 21-23. Mar. 27-Apr. 3, 2009. Instructor: Emanuele Viola Scribe: Ravi Sundaram Matrix multiplication: a roup-theoretic approach Given two n n matrices

More information

Lecture 14 October 16, 2014

Lecture 14 October 16, 2014 CS 224: Advanced Algorithms Fall 2014 Prof. Jelani Nelson Lecture 14 October 16, 2014 Scribe: Jao-ke Chin-Lee 1 Overview In the last lecture we covered learning topic models with guest lecturer Rong Ge.

More information

A Constant Complexity Fair Scheduler with O(log N) Delay Guarantee

A Constant Complexity Fair Scheduler with O(log N) Delay Guarantee A Constant Complexity Fair Scheduler with O(lo N) Delay Guarantee Den Pan and Yuanyuan Yan 2 Deptment of Computer Science State University of New York at Stony Brook, Stony Brook, NY 79 denpan@cs.sunysb.edu

More information

Locality Sensitive Hashing

Locality Sensitive Hashing Locality Sensitive Hashing February 1, 016 1 LSH in Hamming space The following discussion focuses on the notion of Locality Sensitive Hashing which was first introduced in [5]. We focus in the case of

More information

V DD. M 1 M 2 V i2. V o2 R 1 R 2 C C

V DD. M 1 M 2 V i2. V o2 R 1 R 2 C C UNVERSTY OF CALFORNA Collee of Enineerin Department of Electrical Enineerin and Computer Sciences E. Alon Homework #3 Solutions EECS 40 P. Nuzzo Use the EECS40 90nm CMOS process in all home works and projects

More information

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Optimal Data-Dependent Hashing for Approximate Near Neighbors Optimal Data-Dependent Hashing for Approximate Near Neighbors Alexandr Andoni 1 Ilya Razenshteyn 2 1 Simons Institute 2 MIT, CSAIL April 20, 2015 1 / 30 Nearest Neighbor Search (NNS) Let P be an n-point

More information

Round-off Error Free Fixed-Point Design of Polynomial FIR Predictors

Round-off Error Free Fixed-Point Design of Polynomial FIR Predictors Round-off Error Free Fixed-Point Desin of Polynomial FIR Predictors Jarno. A. Tanskanen and Vassil S. Dimitrov Institute of Intellient Power Electronics Department of Electrical and Communications Enineerin

More information

On K-Means Cluster Preservation using Quantization Schemes

On K-Means Cluster Preservation using Quantization Schemes On K-Means Cluster Preservation usin Quantization Schemes Deepak S. Turaa Michail Vlachos Olivier Verscheure IBM T.J. Watson Research Center, Hawthorne, Y, USA IBM Zürich Research Laboratory, Switzerland

More information

1 CHAPTER 7 PROJECTILES. 7.1 No Air Resistance

1 CHAPTER 7 PROJECTILES. 7.1 No Air Resistance CHAPTER 7 PROJECTILES 7 No Air Resistance We suppose that a particle is projected from a point O at the oriin of a coordinate system, the y-axis bein vertical and the x-axis directed alon the round The

More information

Tailored Bregman Ball Trees for Effective Nearest Neighbors

Tailored Bregman Ball Trees for Effective Nearest Neighbors Tailored Bregman Ball Trees for Effective Nearest Neighbors Frank Nielsen 1 Paolo Piro 2 Michel Barlaud 2 1 Ecole Polytechnique, LIX, Palaiseau, France 2 CNRS / University of Nice-Sophia Antipolis, Sophia

More information

Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment

Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment Anshumali Shrivastava and Ping Li Cornell University and Rutgers University WWW 25 Florence, Italy May 2st 25 Will Join

More information

Lecture 17 03/21, 2017

Lecture 17 03/21, 2017 CS 224: Advanced Algorithms Spring 2017 Prof. Piotr Indyk Lecture 17 03/21, 2017 Scribe: Artidoro Pagnoni, Jao-ke Chin-Lee 1 Overview In the last lecture we saw semidefinite programming, Goemans-Williamson

More information

Composite Quantization for Approximate Nearest Neighbor Search

Composite Quantization for Approximate Nearest Neighbor Search Composite Quantization for Approximate Nearest Neighbor Search Jingdong Wang Lead Researcher Microsoft Research http://research.microsoft.com/~jingdw ICML 104, joint work with my interns Ting Zhang from

More information

Linearized optimal power flow

Linearized optimal power flow Linearized optimal power flow. Some introductory comments The advantae of the economic dispatch formulation to obtain minimum cost allocation of demand to the eneration units is that it is computationally

More information

4 Locality-sensitive hashing using stable distributions

4.1 The LSH scheme based on s-stable distributions. In this chapter, we introduce and analyze a novel locality-sensitive hashing family. The family …
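
A small sketch of the hash family this chapter analyzes, h(v) = floor((a·v + b)/w) with a drawn from an s-stable law (Gaussian for s = 2, giving an LSH family for the L2 distance); the parameter values below are ours, purely for illustration.

```python
import numpy as np

def pstable_hash(dim, w, seed=0):
    """One hash function from the 2-stable (Gaussian) LSH family
    for L2 distance: h(v) = floor((a . v + b) / w)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(dim)   # entries drawn from a 2-stable law
    b = rng.uniform(0.0, w)        # random offset, uniform in [0, w)
    return lambda v: int(np.floor((a @ v + b) / w))

h = pstable_hash(dim=4, w=4.0, seed=1)
print(h(np.array([1.0, 0.0, 2.0, -1.0])))
print(h(np.array([1.1, 0.1, 2.0, -0.9])))  # a close point, likely same bucket
```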

Efficient method for obtaining parameters of stable pulse in grating compensated dispersion-managed communication systems

2003 Conference on Information Sciences and Systems, The Johns Hopkins University, March 12-14, 2003. …

Nonlinear Model Reduction of Differential Algebraic Equation (DAE) Systems

Chuili Sun and Juergen Hahn. Department of Chemical Engineering, Texas A&M University, College Station, TX 77843-3, hahn@tamu.edu. Prepared for …

High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction

Chapter 11. High-dimensional vectors are ubiquitous in applications (gene expression data, set of movies watched by Netflix customer, …
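
As a concrete illustration of the dimension-reduction theme of this chapter, a hedged sketch of a Gaussian random projection in the Johnson-Lindenstrauss style; the dimensions and scaling below are chosen by us for the example, not taken from the chapter.

```python
import numpy as np

def random_projection(X, target_dim, seed=0):
    """Project rows of X from d to target_dim dimensions with a Gaussian
    matrix; pairwise distances are preserved up to small distortion with
    high probability (Johnson-Lindenstrauss)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, target_dim)) / np.sqrt(target_dim)
    return X @ R

X = np.random.default_rng(1).standard_normal((5, 1000))
Y = random_projection(X, target_dim=50)
print(Y.shape)  # (5, 50): far fewer coordinates, similar geometry
```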

Graph Entropy, Network Coding and Guessing games

Søren Riis. November 25, 2007. Abstract: We introduce the (private) entropy of a directed graph (in a new network coding sense) as well as a number of related concepts. …

Lecture 5 Processing microarray data

(1) Transform the data into a scale suitable for analysis. (2) Remove the effects of systematic and obfuscating sources of variation. (3) Identify discrepant observations. Preprocessing …

A Multigrid-like Technique for Power Grid Analysis

Joseph N. Kozhaya, Sani R. Nassif, and Farid N. Najm. Abstract: Modern sub-micron VLSI designs include huge power grids that are required to distribute large …

Linear Spectral Hashing

Zalán Bodó and Lehel Csató. Babeş-Bolyai University, Faculty of Mathematics and Computer Science, Kogălniceanu 1., 484 Cluj-Napoca, Romania. Abstract: … assigns binary hash keys to …

Randomized Algorithms

Sanjiv Kumar, Google Research, NY. EECS-6898, Columbia University, Fall 2010 (9/13/2010). Large Scale Machine Learning: Curse of Dimensionality, Gaussian Mixture Models …

Conical Pendulum Linearization Analyses

European J of Physics Education, Volume 7, Issue 3, 309-70, Dean et al. Kevin Dean, Jyothi Mathew. Physics Department, The Petroleum Institute, Abu Dhabi, PO Box 533, United …

Simple Optimization (SOPT) for Nonlinear Constrained Optimization Problem

Journal of Science & Engineering Education (ISSN 4-6), Vol., Page 3-39, Year 7. Vivek Kumar Chouhan*, Joji Thomas**, S. S. Mahapatra …

Problem Set 2 Solutions

UNIVERSITY OF ALABAMA, Department of Physics and Astronomy. PH 125 / LeClair, Spring 2009. The following three problems are due 20 January 2009 at the beginning of class. 1. (H,R,&W 4.39) …

2.2 Differentiation and Integration of Vector-Valued Functions

Simply put, we differentiate and integrate vector functions by differentiating …

Algorithms for Data Science: Lecture on Finding Similar Items

Barna Saha. 1 Finding Similar Items: Finding similar items is a fundamental data mining task. We may want to find whether two documents are similar …
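
A toy sketch of one standard tool for the finding-similar-items task this lecture covers: a MinHash signature whose coordinate-wise agreement rate estimates Jaccard similarity. The hash construction and shingle sets below are our own illustration, not code from the lecture.

```python
import random

def minhash_signature(shingles, num_hashes=64, seed=0):
    """MinHash signature of a set: for each of num_hashes random hash
    functions, keep the minimum hash value over the set's elements."""
    rng = random.Random(seed)
    masks = [rng.getrandbits(64) for _ in range(num_hashes)]
    return [min(hash(s) ^ m for s in shingles) for m in masks]

a = {"the cat", "cat sat", "sat on"}
b = {"the cat", "cat sat", "sat in"}
sa, sb = minhash_signature(a), minhash_signature(b)
match = sum(x == y for x, y in zip(sa, sb)) / len(sa)
print(match)  # fraction of agreeing coordinates estimates Jaccard(a, b)
```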

Translations on graphs with neighborhood preservation

Bastien Pasdeloup, Vincent Gripon, Nicolas Grelier, Jean-Charles Vialatte, Dominique Pastor. arXiv:1793859v1 [cs.DM], 12 Sep 2017. Abstract: In the field …

Modeling for control of a three degrees-of-freedom Magnetic Levitation System

Rafael Becerril-Arreola, Dept. of Electrical and Computer Eng., University of Toronto; Manfredi Maggiore, Dept. of Electrical and Computer …

CHAPTER 2 PROCESS DESCRIPTION

2.1 INTRODUCTION: A process having only one input variable used for controlling one output variable is known as a SISO process. The problem of designing a controller for SISO systems …

Altitude measurement for model rocketry

David A. Caughey. Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, New York 14853. I. INTRODUCTION: In his book, Rocket Boys, Homer Hickam …

Metric Embedding of Task-Specific Similarity

Greg Shakhnarovich, Brown University; joint work with Trevor Darrell (MIT). August 9, 2006. Task-specific similarity: a toy example …

A Probabilistic Analysis of Propositional STRIPS Planning

Tom Bylander. Division of Mathematics, Computer Science, and Statistics, The University of Texas at San Antonio, San Antonio, Texas 78249 USA. bylander@ringer.cs.utsa.edu …

Another possibility is a rotation or reflection, represented by a matrix M.

Chapter 25: Planar defects. Planar defects: orientation and types. Crystalline films often contain internal, 2-D interfaces separating two regions transformed with respect to one another, but with, otherwise, …

THE SIGNAL ESTIMATOR LIMIT SETTING METHOD

Shan Jin, Peter McNamara. Department of Physics, University of Wisconsin Madison, Madison, WI 53706. Abstract: A new method of background subtraction is presented …

The Compact Muon Solenoid Experiment. Conference Report. Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland

Available on the CMS information server, CMS CR-28/5. 14 February 2008. Observing Heavy Top Quarks with …

What Causes Image Intensity Changes?

Edge Detection. Why detect edges? Information reduction: replace the image by a cartoon in which objects and surface markings are outlined (a line-drawing description); these are the most informative parts …

Then we select m_j^u as a cross product between n_j^u and û_j to create an orthonormal basis: m_j^u = n_j^u × û_j (4)

A. Position error covariance for surface feature points: For each surface feature point j, we first compute the normal û_j by using 9 of the neighboring points to fit a plane. In order to create a 3D error ellipsoid …

REAL-TIME TIME-FREQUENCY BASED BLIND SOURCE SEPARATION

Scott Rickard, Radu Balan, Justinian Rosca. Siemens Corporate Research, Princeton, NJ. {scott.rickard, radu.balan, justinian.rosca}@scr.siemens.com. ABSTRACT …

CS246 Final Exam, Winter 2011

1. Your name and student ID. Name: … Student ID: … 2. I agree to comply with the Stanford Honor Code. Signature: … 3. There should be 17 numbered pages in this exam (including …

LP Rounding and Combinatorial Algorithms for Minimizing Active and Busy Time

Jessica Chang, University of Maryland, College Park, MD, USA, jschan@umiacs.umd.edu; Samir Khuller, University of Maryland, College Park, …

CS 124 Section #8: Hashing, Skip Lists. 3/20/17

1 Probability Review. Expectation (weighted average): the expectation of a random quantity X is E[X] = Σ_x x · P(X = x). For each value x that X can take on, we look at …
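
A worked instance of that definition, using a fair six-sided die as our own example:

```latex
E[X] = \sum_{x=1}^{6} x \cdot P(X = x) = \tfrac{1}{6}(1+2+3+4+5+6) = 3.5
```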

Jeffrey D. Ullman, Stanford University

We are given a set of training examples, consisting of input-output pairs (x, y), where: 1. x is an item of the type we want to evaluate. 2. y is the value of some …

Worst case analysis for a general class of on-line lot-sizing heuristics

Wilco van den Heuvel and Albert P.M. Wagelmans. Econometric Institute and Erasmus Research Institute of Management, Erasmus University …

1. According to the U.S. DOE, each year humans are producing 8 billion metric tons (1 metric ton = 2200 lbs) of CO2 through the combustion of fossil fuels (oil, coal and natural gas). 2. Since the late 1950's …

Optimization of Mechanical Design Problems Using Improved Differential Evolution Algorithm

International Journal of Recent Trends in Engineering, Vol. No. 5, May 2009. Millie Pant, Radha Thangaraj and V. P. Singh …

Disclaimer: This lab write-up is not to be copied, in whole or in part, unless a proper reference is made as to the source. (It is strongly recommended that you use this document only to generate ideas, …

arXiv: v2 [stat.ML] 13 Nov 2014

Improved Asymmetric Locality Sensitive Hashing (ALSH) for Maximum Inner Product Search (MIPS). Anshumali Shrivastava, Department of Computer Science, Computing and Information …
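
A minimal sketch of the asymmetric idea behind this line of work: with items rescaled so that ||x|| ≤ 1, appending sqrt(1 − ||x||²) to each item and 0 to the normalized query turns maximum inner product search into cosine similarity search, for which standard LSH applies. Function names are ours and the rescaling step is assumed rather than shown.

```python
import numpy as np

def preprocess_item(x):
    """Append sqrt(1 - ||x||^2) so every item lands on the unit sphere
    (assumes the collection was rescaled so that ||x|| <= 1)."""
    return np.append(x, np.sqrt(max(0.0, 1.0 - x @ x)))

def preprocess_query(q):
    """Append 0 to the normalized query; then
    <Q(q), P(x)> = <q/||q||, x>, so MIPS reduces to cosine search."""
    q = q / np.linalg.norm(q)
    return np.append(q, 0.0)

x = np.array([0.3, 0.4, 0.1])
q = np.array([1.0, 2.0, 0.5])
# Both values below agree: the padding does not change the inner product.
print(preprocess_query(q) @ preprocess_item(x))
print((q / np.linalg.norm(q)) @ x)
```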

Convergence of DFT eigenvalues with cell volume and vacuum level

Sohrab Ismail-Beigi. October 4, 2013. Computing work functions or absolute DFT eigenvalues (e.g. ionization potentials) requires some care. Obviously, …

Similarity searching, or how to find your neighbors efficiently

Robert Krauthgamer, Weizmann Institute of Science. CS Research Day for Prospective Students, May 1, 2009. Background: Geometric spaces and techniques …

Wire antenna model of the vertical grounding electrode

Boundary Elements and Other Mesh Reduction Methods XXXV, 13. D. Poljak & S. Sesnic, University of Split, FESB, Split, Croatia. Abstract: A straight wire antenna …

The University of Texas at Austin, Department of Electrical and Computer Engineering. EE381V: Large Scale Learning, Spring 2013

Assignment 1, Caramanis/Sanghavi. Due: Thursday, Feb. 7, 2013. (Problems 1 and …

Three Spreadsheet Models Of A Simple Pendulum

Spreadsheets in Education (eJSiE), Volume 3, Issue 1, Article 5, 10-31-2008. Jan Benacka, Constantine the Philosopher University, Nitra, Slovakia, jbenacka@ukf.sk …

On Enabling Attribute-Based Encryption to Be Traceable against Traitors

Zhen Liu (Shanghai Jiao Tong University, China, liuzhen@sjtu.edu.cn) and Duncan S. Wong (CryptoBLK, duncanwong@cryptoblk.io). Abstract: …

Computer Vision Lecture 3

Demo: Haribo Classification. Computer Vision Lecture 3: Linear Filters. Bastian Leibe, RWTH Aachen, http://www.vision.rwth-aachen.de, leibe@vision.rwth-aachen.de. Code available on the class website …

(a) 1 m s⁻² (b) 2 m s⁻² (c) zero (d) −1 m s⁻²

11th Physics - Unit 2 Kinematics: Solutions for the Textbook Problems. One mark: 1. Which one of the following Cartesian coordinate systems is not followed in physics? 5. If a particle has negative velocity …

Decoupled Collaborative Ranking

Jun Hu, Ping Li. WWW2017, April 24, 2017. Recommender Systems: a recommendation system is an information filtering technique, which provides …

MULTIDIMENSIONAL COUPLED PHOTON-ELECTRON TRANSPORT SIMULATIONS USING NEUTRAL PARTICLE S_N CODES

Computational Medical Physics Working Group Workshop II, Sep 30 - Oct 3, 2007, University of Florida (UF), Gainesville, Florida, USA, on CD-ROM, American Nuclear Society, LaGrange Park, IL (2007). …

MODELING A METAL HYDRIDE HYDROGEN STORAGE SYSTEM

Apurba Sakti. EGEE 520, Mathematical Modeling of EGEE systems, Spring 2007. Table of Contents: Abstract, Introduction, Governing Equations, Energy balance, Momentum …

Analysis of visibility level in road lighting using image processing techniques

Scientific Research and Essays, Vol. 5 (18), pp. 2779-2785, 18 September, 2010. Available online at http://www.academicjournals.org/sre. ISSN 1992-2248. 2010 Academic Journals. Full Length Research Paper. …

IT6502 DIGITAL SIGNAL PROCESSING. Unit I - SIGNALS AND SYSTEMS: Basic elements of DSP; concepts of frequency in analog and digital signals; sampling theorem; discrete-time signals and systems; analysis of discrete-time …

arXiv:cs/ v1 [cs.IT] 4 Feb 2007

High-rate, Multi-Symbol-Decodable STBCs from Clifford Algebras. Sanjay Karmakar, Beceem Communications Pvt Ltd, Bangalore, skarmakar@beceem.com; B. Sundar Rajan, Department of ECE, Indian Institute of Science, Bangalore …

Solution sheet 9

Advanced Finite Elements MA5337 - WS17/18, Solution sheet 9. On this final exercise sheet about variational inequalities we deal with nonlinear complementarity functions as a way to rewrite a variational …

6.854 Advanced Algorithms

Homework Solutions: Hashing Bashing. Solution: 1. O(log U) for the first level and for each of the O(n) second-level functions, giving a total of O(n log U). 2. Suppose we are using …

5. Network Analysis. 5.1 Introduction

With the continued growth of this country as it enters the next century comes the inevitable increase in the number of vehicles trying to use the already overtaxed transportation …

Emotional Optimized Design of Electro-hydraulic Actuators

Sensors & Transducers, Vol. 77, Issue 8, August, pp. 9-9. Sensors & Transducers by IFSA Publishing, S. L., http://www.sensorsportal.com. Shi Boqian, …

Adjustment of Sampling Locations in Rail-Geometry Datasets: Using Dynamic Programming and Nonlinear Filtering

Systems and Computers in Japan, Vol. 37, No. 1, 2006. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J87-D-II, No. 6, June 2004, pp. 1199-1207. …

11. Excitons in Nanowires and Nanotubes

We have seen that confinement in quantum wells leads to enhanced excitonic effects in the optical response of semiconductors. The binding energy of the strongest bound excitons …

Interpreting Deep Classifiers

Ruprecht-Karls-University Heidelberg, Faculty of Mathematics and Computer Science. Seminar: Explainable Machine Learning. Interpreting Deep Classifiers by Visual Distillation of Dark Knowledge. Author: Daniela …

Parameterization and Vector Fields

17.1 Parameterized Curves. Curves in 2- and 3-space can be represented by parametric equations. Parametric equations have the form x = x(t), y = y(t) in the plane and x = x(t), …

Phase Diagrams: construction and comparative statics

November 13, 2015. Alecos Papadopoulos, PhD Candidate, Department of Economics, Athens University of Economics and Business. papadopalex@aueb.gr, https://alecospapadopoulos.wordpress.com …

Teams to exploit spatial locality among agents

James Parker and Maria Gini. Department of Computer Science and Engineering, University of Minnesota. Email: [jparker,gini]@cs.umn.edu. Abstract: In many situations, …

Renormalization Group Theory

Chapter 16. In the previous chapter a procedure was developed where higher-order 2^n cycles were related to lower-order cycles through a functional composition and rescaling procedure. …

Hash-based Indexing: Application, Impact, and Realization Alternatives

Benno Stein and Martin Potthast. Bauhaus University Weimar, Web-Technology and Information Systems. Text-based Information Retrieval (TIR). Motivation: Consider …

What types of isometric transformations are we talking about here? One common case is a translation by a displacement vector R.

1. Planar Defects. Planar defects: orientation and types. Crystalline films often contain internal, 2-D interfaces separating two regions transformed with respect to one another, but with, otherwise, essentially, …

Journal of Inequalities in Pure and Applied Mathematics

L'HOSPITAL TYPE RULES FOR OSCILLATION, WITH APPLICATIONS. Iosif Pinelis, Department of Mathematical Sciences, Michigan Technological University, Houghton, …

Ground Rules. PC1221 Fundamentals of Physics I. Position and Displacement. Average Velocity. Lectures 7 and 8: Motion in Two Dimensions

PC1221 Fundamentals of Physics I, Lectures 7 and 8: Motion in Two Dimensions. Dr Tay Seng Chuan. Ground Rules: switch off your handphone and pager; switch off your laptop computer and keep it; no talking while lecture …

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai, 12/4/18

Overfitting: overfitting is fitting the training data more than is warranted, i.e. fitting noise rather than signal. Estimating …

Locality-Sensitive Hashing for Chi2 Distance

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XXXX, NO. XXX, JUNE 2010. David Gorisse, Matthieu Cord, and Frederic Precioso. Abstract: In …

Advanced Methods Development for Equilibrium Cycle Calculations of the RBWR

Andrew Hall, 11/7/2013. Outline: RBWR Motivation and Design; Why use Serpent Cross Sections?; Modeling the RBWR; Axial Discontinuity …

PROJECTILE MOTION: Velocity

We seek to explore the velocity of the projectile, including its final value as it hits the ground, or a target above the ground. The angle made by the velocity vector with the local …

Lecture 8: Pseudorandom Generators (II). 1 Pseudorandom Generators for Bounded Computation

Expander Graphs in Computer Science, WS 2010/2011. Lecture 8: Pseudorandom Generators (II). Lecturer: He Sun. Definition 8.1: Let M be a randomized TM that …

The Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4, Fall 2017

Momentum for Principal Component Analysis, CS6787 Lecture 3.1, Fall 2017. Principal Component Analysis. Setting: find the …

5 Shallow water Q-G theory

So far we have discussed the fact that large-scale motions in the extra-tropical atmosphere are close to geostrophic balance, i.e. the Rossby number is small. We have examined …

11 Free vibrations: one degree of freedom

11.1 A uniform rigid disk of radius r and mass m rolls without slipping inside a circular track of radius R, as shown in the figure. The centroidal moment of inertia …

COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS)

This part is to a large extent based on slides obtained from http://www.mmds.org. Distance Measures: For finding similar documents, we consider the Jaccard …
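
A one-function sketch of the Jaccard measure the slide set introduces; the document shingle sets below are our own toy example.

```python
def jaccard(A, B):
    """Jaccard similarity of two sets: |A intersect B| / |A union B|."""
    return len(A & B) / len(A | B)

doc1 = {"the", "cat", "sat"}
doc2 = {"the", "cat", "ran"}
print(jaccard(doc1, doc2))  # 2 shared / 4 total = 0.5
```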

HEAT TRANSFER IN EXHAUST SYSTEM OF A COLD START ENGINE AT LOW ENVIRONMENTAL TEMPERATURE

THERMAL SCIENCE: Vol. 14 (2010), Suppl., pp. S219-S232. By Snežana D. Petković, Radivoje B. Pešić, and Jovanka …

From Loop Groups To 2-Groups

John C. Baez. Joint work with: Aaron Lauda, Alissa Crans, Danny Stevenson & Urs Schreiber. More details at: http://math.ucr.edu/home/baez/2group/. Higher Gauge Theory …

FLUID flow in a slightly inclined rectangular open …

Numerical Solution of de St. Venant Equations with Controlled Global Boundaries Between Unsteady Subcritical States. Aldrin P. Mendoza, Adrian Roy L. Valdez, and Carlene P. Arceo. Abstract: This paper aims …

Levenberg-Marquardt-based OBS Algorithm using Adaptive Pruning Interval for System Identification with Dynamic Neural Networks

Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, San Antonio, TX, USA, October 2009. …

Deterministic Algorithm Computing All Generators: Application in Cryptographic Systems Design

Int. J. Communications, Network and System Sciences, 0, 5, 75-79. http://dx.doi.org/0.436/ijcns.0.5074. Published Online November 0 (http://www.scirp.org/journal/ijcns). …

Stochastic simulations of genetic switch systems

Adiel Loinger, Azi Lipshtat, Nathalie Q. Balaban, and Ofer Biham. Racah Institute of Physics, The Hebrew University, Jerusalem 91904, Israel; Department …