Optimal Spatial Dominance: An Effective Search of Nearest Neighbor Candidates
|
|
- Camilla Bond
- 6 years ago
- Views:
Transcription
1 Optimal Spatial Dominance: An Effective Search of Nearest Neighbor Candidates X I A O YA N G W A N G 1, Y I N G Z H A N G 2, W E N J I E Z H A N G 1, X U E M I N L I N 1, M U H A M M A D A A M I R C H E E M A 3 1. T H E U N I V E R S I T Y O F N E W S O U T H WA L E S, A U S T R A L I A 2. U N I V E R S I T Y O F T E C H N O L O G Y, S Y D N E Y, A U S T R A L I A 3. M O N A S H U N I V E R S I T Y, A U S T R A L I A
2 Outline Introduction Related Work Spatial Dominance Operators Experiments Conclusion 2
3 Introduction Objects with multiple instances are widely used. Uncertain object (instances are exclusive), e.g., uncertain spatial objects. Multi-valued object (instances are co-occurrence), e.g., NBA player records. Player C a a c Player D longitude a a rebound d c c d b d c latitude score 3
4 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. Easy for objects with single instance when metric is given. Query L2 4
5 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. Easy for objects with single instance when metric is given. Query L2 Query L2 5
6 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. Easy for objects with single instance when metric is given. Query L2 Query L2 6
7 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. Min/Max, Expected, Quantile. NN probability, Expected rank. Earth Mover s, Netflow. 7
8 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. N 1 : all pairs based NN for multi-valued or uncertain object E.g., Min/Max, Expected, Quantile. N 2 : possible world based NN for uncertain object E.g., NN probability, Expected rank. N 3 : selected pairs based NN for multi-valued or uncertain object E.g., Earth Mover s, Netflow. 8
9 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. N 1 : all pairs based NN for multi-valued or uncertain object E.g., Min/Max, Expected, Quantile. N 2 : possible world based NN for uncertain object E.g., NN probability, Expected rank. N 3 : selected pairs based NN for multi-valued or uncertain object E.g., Earth Mover s, Netflow. 9
10 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 1. Expected (all pairs based NN) q a a b A Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q E dis A, Q = = 5.75 b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q B Q E dis B, Q = =
11 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 2. NN probability (possible world based NN) a a q b A Q B Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q q
12 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 2. NN probability (possible world based NN) a a q q b A Q B Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q W 1 : a 1 b 1 q 1 -> p a,w1 = 1/8, p b,w1 = 0 12
13 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 2. NN probability (possible world based NN) a a q q b A Q B Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q W 1 : a 1 b 1 q 1 -> p a,w1 = 1/8, p b,w1 = 0 W 2 : a 2 b 1 q 1 -> p a,w2 = 0, p b,w2 = 1/8 13
14 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 2. NN probability (possible world based NN) a a q q b A Q B Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q W 1 : a 1 b 1 q 1 -> p a,w1 = 1/8, p b,w1 = 0 W 2 : a 2 b 1 q 1 -> p a,w2 = 0, p b,w2 = 1/8 => W 8 : a 2 b 2 q 2 -> p a,w8 = 1/8, p b,w8 = 0 NN p A, Q = p a,wi = 6/8 NN p B, Q = p b,wi = 2/8 14
15 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 3. Earth Mover s (selected pairs based NN ) a A Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q a b a 1 1, 0.5 q 1 q q a 2 9, 0.5 q 2 15
16 Introduction Nearest Neighbour (NN) search: Given a query object Q, return the nearest object to the query. There are a lot of NN ranking functions for objects with multiple instances. 3. Earth Mover s (selected pairs based NN ) a A Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q a b a 1 1, 0.5 q 1 q => EMD(A, Q) = 1*0.5+9*0.5 = 5 q a 2 9, 0.5 q 2 16
17 Introduction There are a lot of NN ranking functions for objects with multiple instances. The parameters in NN function can be set to infinite value, e.g., quantile. 17
18 Introduction There are a lot of NN ranking functions for objects with multiple instances. The parameters in NN function can be set to infinite value, e.g., quantile. Motivation A user may not have a specific NN function in mind, it is desirable to provide her with a set of NN candidates. 18
19 Introduction A spatial dominate (SD) operator is used to define the partial order between objects regarding a query Q, i.e., SD(A, B, Q) means A dominates B, otherwise SD(A, B, Q). Given a SD operator, the NN candidate set consists of the objects that are not dominated by any other objects. Optimal SD operator w.r.t a family F of NN functions Correctness: SD(A, B, Q) f F, that f (A) f (B) Completeness: SD(A, B, Q) f F, that f (A) > f (B) 19
20 Related Work Full dominance operator (F-SD) We have FSD(A,B,Q), if and only if q Q, max A, q min (B, q) (SIGMOD10, SIGMOD14) Cheng Long, et. al, The Hong Kong University of Science and Technology Tobias Emrich, et. al, Institute for Informatics, Ludwig-Maximilians-Universität München FSD B, A, Q FSD(B, C, Q) C B c1 q1 Q Object C b1 A Over pessimistic: contain many redundant objects (no completeness property) 20
21 Related Work NN Core (TKDE10) Sze Man Yuen, et. al, Chinese University of Hong Kong A B: if A has higher chance to be closer to the query than B; NN candidates : Minimal set of objects, each of which beats any object not in the NN candidates. NN-Core A Object C a q c b c B C b a Too aggressive: violate correctness property 21
22 Related Work Stochastic order has been widely used in various domains to compare the goodness of two random variables distributions. Stochastic Order. Given two independent random variables X and Y, we say X is smaller than Y in usual stochastic order, denoted by X st Y, if Pr(X λ) Pr(Y λ) for every λ R. A Q a 1 q 1 a 1 q 2 a 2 q 1 a 2 q B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q Pr A Q 1.5 = 0.25; Pr B Q 1.5 = 0 Pr A Q 10 = 1; Pr B Q 10 =
23 Problem Definition Objects with multiple instances An object U consists of a set {u i } of instances, and a discrete probability mass function assigns each instance u i a probability value, denoted by p(u i ), where p u i = 1. The multi-valued objects are treated as discrete random variable if their weight can be normalized. Problem statement Devise spatial dominance (SD) operators by carefully considering various NN function families. 23
24 Classify NN Functions N 1 : all pairs based NN Aggregation over pair-wise s g(a Q ), e.g., Min/Max, Expected, Quantile. N 2 : possible world based NN Aggregation over score on each possible world g(a W ), e.g., NN probability, Expected rank. N 3 : selected pairs based NN Aggregation over selected pairs g(s(a Q )), e.g., Earth Mover s, Netflow. 24
25 SD Operator Stochastic-SD (S-SD, opt. w.r.t N 1 ) Given two objects U and V, and the query Q, we have SSD(U,V,Q) if and only if U Q st V Q and U Q V Q. Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 25
26 SD Operator Stochastic-SD (S-SD, opt. w.r.t N 1 ) Given two objects U and V, and the query Q, we have SSD(U,V,Q) if and only if U Q st V Q and U Q V Q. Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 26
27 SD Operator Stochastic-SD (S-SD, opt. w.r.t N 1 ) Given two objects U and V, and the query Q, we have SSD(U,V,Q) if and only if U Q st V Q and U Q V Q. Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 27
28 SD Operator Stochastic-SD (S-SD, opt. w.r.t N 1 ) Given two objects U and V, and the query Q, we have SSD(U,V,Q) if and only if U Q st V Q and U Q V Q. Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 28
29 SD Operator Strict Stochastic-SD (SS-SD, opt. w.r.t N 1,2 ) Given two objects U and V, and the query Q, we have SSSD(U,V,Q) if and only if U q st V q for q Q and U Q V Q. A q1 A q2 Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 C q2 C q1 29
30 SD Operator Strict Stochastic-SD (SS-SD, opt. w.r.t N 1,2 ) Given two objects U and V, and the query Q, we have SSSD(U,V,Q) if and only if U q st V q for q Q and U Q V Q. A q1 A q2 Object C A Q a 1 q 1 a 2 q 1 a 1 q 2 a 2 q 2 a a b B Q b 1 q 1 b 1 q 2 b 2 q 1 b 2 q 2 q q c c C Q c 1 q 2 c 2 q 2 c 1 q 1 c 2 q 1 C q2 C q1 30
31 SD Operator Peer-SD (P-SD, opt. w.r.t N 1,2,3 ) Given two objects U and V, and the query Q, we have PSD(U,V,Q) if there is a mapping between U and V, for each pair <u,v>, u is always closer to Q than v with same weight. b b a a a a q q q q
32 SD Operator Peer-SD (P-SD, opt. w.r.t N 1,2,3 ) Given two objects U and V, and the query Q, we have PSD(U,V,Q) if there is a mapping between U and V, for each pair <u,v>, u is always closer to Q than v with same weight. b b b a a a a q q a q q
33 SD Operator Peer-SD (P-SD, opt. w.r.t N 1,2,3 ) Given two objects U and V, and the query Q, we have PSD(U,V,Q) if there is a mapping between U and V, for each pair <u,v>, u is always closer to Q than v with same weight. b b b a a a a q q a q q
34 P-SD Check P-SD check Naively have to check all the mapping. The P-SD check between U and V can be reduced to compute the network flow problem, PSD(U,V,Q) iff the network flow is 1 b a a s a 1 b 1 t q a 2 b 2 q
35 P-SD Check P-SD check Naively have to check all the mapping. The P-SD check between U and V can be reduced to compute the network flow problem, PSD(U,V,Q) iff the network flow is 1 b a a s a 1 b 1 t q a 2 b 2 q
36 SD Operator Properties SD operator containment NNC(O, Q, S-SD) NNC(O, Q, SS-SD) NNC(O, Q, P-SD) NNC(O, Q, F-SD) F-SD based NN P-SD based NN SS-SD based NN S-SD based NN N 1,2,3 N 1,2,3 N 1,2 N 1 NN candidate search Based on branch and bound framework like skyline search 36
37 Experiments Compare Algorithm SSD, SSSD, PSD, FSD and F + SD Datasets: real dataset: NBA, Gowalla; semi-real dataset: House, CA, USA; synthetic dataset. Evaluation parameter Values dimensionality d 2, 3, 4, 5 # of objects n 100k, 200k, 400k, 600k, 1M # of object instances m d 20, 40, 60, 80, 100 edge length of object h d 100, 200, 300, 400, 500 object center distribution anti (A), indep (E) # of query instances m q 10, 20, 30, 40, 50 edge length of query h q 100, 200, 300, 400,
38 Experiments (a) Candidate Size of Different Datasets (b) Response Time of Different Datasets 38
39 Experiments (a) Varying n (c) Varying n (b) Varying h d (d) Varying h d 39
40 Conclusion Formalize three families of NN functions that cover popular NN ranking mechanisms. Advocate three SD operator that are optimal to different family of NN functions. Propose efficient NN candidate search algorithm for three SD operators. 40
41 Thanks! Q&A 41
42 Experiments 42
43 Experiments 43
A Survey on Spatial-Keyword Search
A Survey on Spatial-Keyword Search (COMP 6311C Advanced Data Management) Nikolaos Armenatzoglou 06/03/2012 Outline Problem Definition and Motivation Query Types Query Processing Techniques Indices Algorithms
More informationPivot Selection Techniques
Pivot Selection Techniques Proximity Searching in Metric Spaces by Benjamin Bustos, Gonzalo Navarro and Edgar Chávez Catarina Moreira Outline Introduction Pivots and Metric Spaces Pivots in Nearest Neighbor
More informationA Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases
A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases Thomas Bernecker #, Tobias Emrich #, Hans-Peter Kriegel #, Nikos Mamoulis, Matthias Renz #, Andreas Züfle #
More informationarxiv: v2 [cs.db] 5 May 2011
Inverse Queries For Multidimensional Spaces Thomas Bernecker 1, Tobias Emrich 1, Hans-Peter Kriegel 1, Nikos Mamoulis 2, Matthias Renz 1, Shiming Zhang 2, and Andreas Züfle 1 arxiv:113.172v2 [cs.db] May
More informationAlternative Clustering, Multiview Clustering: What Can We Learn From Each Other?
LUDWIG- MAXIMILIANS- UNIVERSITÄT MÜNCHEN INSTITUTE FOR INFORMATICS DATABASE Subspace Clustering, Ensemble Clustering, Alternative Clustering, Multiview Clustering: What Can We Learn From Each Other? MultiClust@KDD
More informationMining bi-sets in numerical data
Mining bi-sets in numerical data Jérémy Besson, Céline Robardet, Luc De Raedt and Jean-François Boulicaut Institut National des Sciences Appliquées de Lyon - France Albert-Ludwigs-Universitat Freiburg
More informationk-selection Query over Uncertain Data
k-selection Query over Uncertain Data Xingjie Liu 1 Mao Ye 2 Jianliang Xu 3 Yuan Tian 4 Wang-Chien Lee 5 Department of Computer Science and Engineering, The Pennsylvania State University, University Park,
More informationSkylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland
Yufei Tao ITEE University of Queensland Today we will discuss problems closely related to the topic of multi-criteria optimization, where one aims to identify objects that strike a good balance often optimal
More informationPoint-of-Interest Recommendations: Learning Potential Check-ins from Friends
Point-of-Interest Recommendations: Learning Potential Check-ins from Friends Huayu Li, Yong Ge +, Richang Hong, Hengshu Zhu University of North Carolina at Charlotte + University of Arizona Hefei University
More informationQuerying Uncertain Spatio-Temporal Data
Querying Uncertain Spatio-Temporal Data Tobias Emrich #, Hans-Peter Kriegel #, Nikos Mamoulis 2, Matthias Renz #, Andreas Züfle # # Institute for Informatics, Ludwig-Maximilians-Universität München Oettingenstr.
More informationA Bayesian Method for Guessing the Extreme Values in a Data Set
A Bayesian Method for Guessing the Extreme Values in a Data Set Mingxi Wu University of Florida May, 2008 Mingxi Wu (University of Florida) May, 2008 1 / 74 Outline Problem Definition Example Applications
More informationHans-Peter Kriegel, Peer Kröger, Irene Ntoutsi, Arthur Zimek
Hans-Peter Kriegel, Peer Kröger, Irene Ntoutsi, Arthur Zimek SSDBM, 20-22/7/2011, Portland OR Ludwig-Maximilians-Universität (LMU) Munich, Germany www.dbs.ifi.lmu.de Motivation Subspace clustering for
More informationk-nearest Neighbor Classification over Semantically Secure Encry
k-nearest Neighbor Classification over Semantically Secure Encrypted Relational Data Reporter:Ximeng Liu Supervisor: Rongxing Lu School of EEE, NTU May 9, 2014 1 2 3 4 5 Outline 1. Samanthula B K, Elmehdwi
More informationMario A. Nascimento. Univ. of Alberta, Canada http: //
DATA CACHING IN W Mario A. Nascimento Univ. of Alberta, Canada http: //www.cs.ualberta.ca/~mn With R. Alencar and A. Brayner. Work partially supported by NSERC and CBIE (Canada) and CAPES (Brazil) Outline
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationBatch Mode Sparse Active Learning. Lixin Shi, Yuhang Zhao Tsinghua University
Batch Mode Sparse Active Learning Lixin Shi, Yuhang Zhao Tsinghua University Our work Propose an unified framework of batch mode active learning Instantiate the framework using classifiers based on sparse
More informationFinding Pareto Optimal Groups: Group based Skyline
Finding Pareto Optimal Groups: Group based Skyline Jinfei Liu Emory University jinfei.liu@emory.edu Jun Luo Lenovo; CAS jun.luo@siat.ac.cn Li Xiong Emory University lxiong@emory.edu Haoyu Zhang Emory University
More informationK-Nearest Neighbor Temporal Aggregate Queries
Experiments and Conclusion K-Nearest Neighbor Temporal Aggregate Queries Yu Sun Jianzhong Qi Yu Zheng Rui Zhang Department of Computing and Information Systems University of Melbourne Microsoft Research,
More informationDraft Water Resources Management Plan 2019 Annex 3: Supply forecast Appendix B: calibration of the synthetic weather generator
Draft Water Resources Management Plan 2019 Annex 3: Supply forecast Appendix B: calibration of the synthetic weather generator November 30, 2017 Version 1 Introduction This appendix contains calibration
More informationProbabilistic Nearest-Neighbor Query on Uncertain Objects
Probabilistic Nearest-Neighbor Query on Uncertain Objects Hans-Peter Kriegel, Peter Kunath, Matthias Renz University of Munich, Germany {kriegel, kunath, renz}@dbs.ifi.lmu.de Abstract. Nearest-neighbor
More informationConstraint-based Subspace Clustering
Constraint-based Subspace Clustering Elisa Fromont 1, Adriana Prado 2 and Céline Robardet 1 1 Université de Lyon, France 2 Universiteit Antwerpen, Belgium Thursday, April 30 Traditional Clustering Partitions
More informationKnowledge Discovery in Data: Overview. Naïve Bayesian Classification. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..
Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar Knowledge Discovery in Data: Naïve Bayes Overview Naïve Bayes methodology refers to a probabilistic approach to information discovery
More informationData Analytics Beyond OLAP. Prof. Yanlei Diao
Data Analytics Beyond OLAP Prof. Yanlei Diao OPERATIONAL DBs DB 1 DB 2 DB 3 EXTRACT TRANSFORM LOAD (ETL) METADATA STORE DATA WAREHOUSE SUPPORTS OLAP DATA MINING INTERACTIVE DATA EXPLORATION Overview of
More informationCitation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n.
University of Groningen Geographically constrained information retrieval Andogah, Geoffrey IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from
More informationClustering Lecture 1: Basics. Jing Gao SUNY Buffalo
Clustering Lecture 1: Basics Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics Clustering
More information6.842 Randomness and Computation Lecture 5
6.842 Randomness and Computation 2012-02-22 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Michael Forbes 1 Overview Today we will define the notion of a pairwise independent hash function, and discuss its
More informationFantope Regularization in Metric Learning
Fantope Regularization in Metric Learning CVPR 2014 Marc T. Law (LIP6, UPMC), Nicolas Thome (LIP6 - UPMC Sorbonne Universités), Matthieu Cord (LIP6 - UPMC Sorbonne Universités), Paris, France Introduction
More informationCrowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
2015 The University of Texas at Arlington. All Rights Reserved. Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons Abolfazl Asudeh, Gensheng Zhang, Naeemul Hassan, Chengkai Li, Gergely
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 22: Nearest Neighbors, Kernels 4/18/2011 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!)
More informationFinding Hierarchies of Subspace Clusters
Proc. 10th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD), Berlin, Germany, 2006 Finding Hierarchies of Subspace Clusters Elke Achtert, Christian Böhm, Hans-Peter Kriegel,
More informationRobustness and bootstrap techniques in portfolio efficiency tests
Robustness and bootstrap techniques in portfolio efficiency tests Dept. of Probability and Mathematical Statistics, Charles University, Prague, Czech Republic July 8, 2013 Motivation Portfolio selection
More informationClassification of Publications Based on Statistical Analysis of Scientific Terms Distributions
AUSTRIAN JOURNAL OF STATISTICS Volume 37 (2008), Number 1, 109 118 Classification of Publications Based on Statistical Analysis of Scientific Terms Distributions Vaidas Balys and Rimantas Rudzkis Institute
More informationTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest Neighbors Frank Nielsen 1 Paolo Piro 2 Michel Barlaud 2 1 Ecole Polytechnique, LIX, Palaiseau, France 2 CNRS / University of Nice-Sophia Antipolis, Sophia
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Two sample T 2 test 1 Two sample T 2 test 2 Analogous to the univariate context, we
More informationStochastic Dominance and Prospect Dominance with Subjective Weighting Functions
Stochastic Dominance and Prospect Dominance with Subjective Weighting Functions Haim Levy *, Zvi Wiener * Second version 2/15/98 Please address your correspondence to Haim Levy at Business School, The
More informationInformation Retrieval
Introduction to Information Retrieval Lecture 11: Probabilistic Information Retrieval 1 Outline Basic Probability Theory Probability Ranking Principle Extensions 2 Basic Probability Theory For events A
More informationHigh-Dimensional Indexing by Distributed Aggregation
High-Dimensional Indexing by Distributed Aggregation Yufei Tao ITEE University of Queensland In this lecture, we will learn a new approach for indexing high-dimensional points. The approach borrows ideas
More informationOn Top-k Structural. Similarity Search. Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada
On Top-k Structural 1 Similarity Search Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China 2014/10/14 Pei
More informationTypes of spatial data. The Nature of Geographic Data. Types of spatial data. Spatial Autocorrelation. Continuous spatial data: geostatistics
The Nature of Geographic Data Types of spatial data Continuous spatial data: geostatistics Samples may be taken at intervals, but the spatial process is continuous e.g. soil quality Discrete data Irregular:
More informationFilter Methods. Part I : Basic Principles and Methods
Filter Methods Part I : Basic Principles and Methods Feature Selection: Wrappers Input: large feature set Ω 10 Identify candidate subset S Ω 20 While!stop criterion() Evaluate error of a classifier using
More informationPerformance Evaluation. Analyzing competitive influence maximization problems with partial information: An approximation algorithmic framework
Performance Evaluation ( ) Contents lists available at ScienceDirect Performance Evaluation journal homepage: www.elsevier.com/locate/peva Analyzing competitive influence maximization problems with partial
More informationMachine Learning Basics
Security and Fairness of Deep Learning Machine Learning Basics Anupam Datta CMU Spring 2019 Image Classification Image Classification Image classification pipeline Input: A training set of N images, each
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 24: Perceptrons and More! 4/22/2010 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements W7 due tonight [this is your last written for
More informationDefinitions and Proofs
Giving Advice vs. Making Decisions: Transparency, Information, and Delegation Online Appendix A Definitions and Proofs A. The Informational Environment The set of states of nature is denoted by = [, ],
More informationarxiv: v2 [cs.db] 20 Jan 2014
Probabilistic Nearest Neighbor Queries on Uncertain Moving Object Trajectories Submitted for Peer Review 1.5.213 Please get the significantly revised camera-ready version under http://www.vldb.org/pvldb/vol7/p25-niedermayer.pdf
More informationAnnouncements. CS 188: Artificial Intelligence Spring Classification. Today. Classification overview. Case-Based Reasoning
CS 188: Artificial Intelligence Spring 21 Lecture 22: Nearest Neighbors, Kernels 4/18/211 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!) Remaining
More informationEvaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments
Evaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments Reynold Cheng, Dmitri V. Kalashnikov Sunil Prabhakar The Hong Kong Polytechnic University, Hung Hom, Kowloon,
More informationTime-aware Point-of-interest Recommendation
Time-aware Point-of-interest Recommendation Quan Yuan, Gao Cong, Zongyang Ma, Aixin Sun, and Nadia Magnenat-Thalmann School of Computer Engineering Nanyang Technological University Presented by ShenglinZHAO
More informationScalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data
VLDB Journal manuscript No. (will be inserted by the editor) Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data Jinchuan Chen Reynold Cheng Mohamed
More informationForecasting Regional Employment in Germany: A Review of Neural Network Approaches. Objectives:
Forecasting Regional Employment in Germany: A Review of Neural Network Approaches Peter Nijkamp, Aura Reggiani, Roberto Patuelli Objectives: To develop and apply Neural Network (NN) models in order to
More informationMoment-based Availability Prediction for Bike-Sharing Systems
Moment-based Availability Prediction for Bike-Sharing Systems Jane Hillston Joint work with Cheng Feng and Daniël Reijsbergen LFCS, School of Informatics, University of Edinburgh http://www.quanticol.eu
More informationMaking rating curves - the Bayesian approach
Making rating curves - the Bayesian approach Rating curves what is wanted? A best estimate of the relationship between stage and discharge at a given place in a river. The relationship should be on the
More informationModeling of Atmospheric Effects on InSAR Measurements With the Method of Stochastic Simulation
Modeling of Atmospheric Effects on InSAR Measurements With the Method of Stochastic Simulation Z. W. LI, X. L. DING Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Hung
More informationDistance Metric Learning
Distance Metric Learning Technical University of Munich Department of Informatics Computer Vision Group November 11, 2016 M.Sc. John Chiotellis: Distance Metric Learning 1 / 36 Outline Computer Vision
More informationAssignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran
Assignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran 1. Let A m n be a matrix of real numbers. The matrix AA T has an eigenvector x with eigenvalue b. Then the eigenvector y of A T A
More informationProbabilistic and Logistic Circuits: A New Synthesis of Logic and Machine Learning
Probabilistic and Logistic Circuits: A New Synthesis of Logic and Machine Learning Guy Van den Broeck KULeuven Symposium Dec 12, 2018 Outline Learning Adding knowledge to deep learning Logistic circuits
More information4.2 The Normal Distribution. that is, a graph of the measurement looks like the familiar symmetrical, bell-shaped
4.2 The Normal Distribution Many physiological and psychological measurements are normality distributed; that is, a graph of the measurement looks like the familiar symmetrical, bell-shaped distribution
More informationRanking Sequential Patterns with Respect to Significance
Ranking Sequential Patterns with Respect to Significance Robert Gwadera, Fabio Crestani Universita della Svizzera Italiana Lugano, Switzerland Abstract. We present a reliable universal method for ranking
More informationWhom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services
Whom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services Caleb Chen CAO, Jieying SHE, Yongxin TONG, Lei CHEN The Hong Kong University of Science and Technology Is Istanbul the capital
More information(i) The optimisation problem solved is 1 min
STATISTICAL LEARNING IN PRACTICE Part III / Lent 208 Example Sheet 3 (of 4) By Dr T. Wang You have the option to submit your answers to Questions and 4 to be marked. If you want your answers to be marked,
More informationIR Models: The Probabilistic Model. Lecture 8
IR Models: The Probabilistic Model Lecture 8 ' * ) ( % $ $ +#! "#! '& & Probability of Relevance? ' ', IR is an uncertain process Information need to query Documents to index terms Query terms and index
More informationModel-based probabilistic frequent itemset mining
Knowl Inf Syst (2013) 37:181 217 DOI 10.1007/s10115-012-0561-2 REGULAR PAPER Model-based probabilistic frequent itemset mining Thomas Bernecker Reynold Cheng David W. Cheung Hans-Peter Kriegel Sau Dan
More informationNearest neighbor classification in metric spaces: universal consistency and rates of convergence
Nearest neighbor classification in metric spaces: universal consistency and rates of convergence Sanjoy Dasgupta University of California, San Diego Nearest neighbor The primeval approach to classification.
More informationModern Information Retrieval
Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction
More informationOutline. Geographic Information Analysis & Spatial Data. Spatial Analysis is a Key Term. Lecture #1
Geographic Information Analysis & Spatial Data Lecture #1 Outline Introduction Spatial Data Types: Objects vs. Fields Scale of Attribute Measures GIS and Spatial Analysis Spatial Analysis is a Key Term
More informationProbabilistic Similarity Search for Uncertain Time Series
Probabilistic Similarity Search for Uncertain Time Series Johannes Aßfalg, Hans-Peter Kriegel, Peer Kröger, Matthias Renz Ludwig-Maximilians-Universität München, Oettingenstr. 67, 80538 Munich, Germany
More informationNearest Neighbor. Machine Learning CSE546 Kevin Jamieson University of Washington. October 26, Kevin Jamieson 2
Nearest Neighbor Machine Learning CSE546 Kevin Jamieson University of Washington October 26, 2017 2017 Kevin Jamieson 2 Some data, Bayes Classifier Training data: True label: +1 True label: -1 Optimal
More informationLecture 8. Spatial Estimation
Lecture 8 Spatial Estimation Lecture Outline Spatial Estimation Spatial Interpolation Spatial Prediction Sampling Spatial Interpolation Methods Spatial Prediction Methods Interpolating Raster Surfaces
More informationMaking Fast Buffer Insertion Even Faster via Approximation Techniques
Making Fast Buffer Insertion Even Faster via Approximation Techniques Zhuo Li, C. N. Sze, Jiang Hu and Weiping Shi Department of Electrical Engineering Texas A&M University Charles J. Alpert IBM Austin
More informationMIRA, SVM, k-nn. Lirong Xia
MIRA, SVM, k-nn Lirong Xia Linear Classifiers (perceptrons) Inputs are feature values Each feature has a weight Sum is the activation activation w If the activation is: Positive: output +1 Negative, output
More informationFinding Top-k Preferable Products
JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO., JANUARY 7 Finding Top-k Preferable Products Yu Peng, Raymond Chi-Wing Wong and Qian Wan Abstract The importance of dominance and skyline analysis has been
More informationProbabilistic Aspects of Voting
Probabilistic Aspects of Voting UUGANBAATAR NINJBAT DEPARTMENT OF MATHEMATICS THE NATIONAL UNIVERSITY OF MONGOLIA SAAM 2015 Outline 1. Introduction to voting theory 2. Probability and voting 2.1. Aggregating
More informationDistance Measures. Objectives: Discuss Distance Measures Illustrate Distance Measures
Distance Measures Objectives: Discuss Distance Measures Illustrate Distance Measures Quantifying Data Similarity Multivariate Analyses Re-map the data from Real World Space to Multi-variate Space Distance
More informationTDT4173 Machine Learning
TDT4173 Machine Learning Lecture 3 Bagging & Boosting + SVMs Norwegian University of Science and Technology Helge Langseth IT-VEST 310 helgel@idi.ntnu.no 1 TDT4173 Machine Learning Outline 1 Ensemble-methods
More informationMining Classification Knowledge
Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology SE lecture revision 2013 Outline 1. Bayesian classification
More informationFoundation of Cryptography, Lecture 4 Pseudorandom Functions
Foundation of Cryptography, Lecture 4 Pseudorandom Functions Handout Mode Iftach Haitner, Tel Aviv University Tel Aviv University. March 11, 2014 Iftach Haitner (TAU) Foundation of Cryptography March 11,
More informationCMPUT651: Differential Privacy
CMPUT65: Differential Privacy Homework assignment # 2 Due date: Apr. 3rd, 208 Discussion and the exchange of ideas are essential to doing academic work. For assignments in this course, you are encouraged
More informationAn Efficient reconciliation algorithm for social networks
An Efficient reconciliation algorithm for social networks Silvio Lattanzi (Google Research NY) Joint work with: Nitish Korula (Google Research NY) ICERM Stochastic Graph Models Outline Graph reconciliation
More informationFace Recognition. Face Recognition. Subspace-Based Face Recognition Algorithms. Application of Face Recognition
ace Recognition Identify person based on the appearance of face CSED441:Introduction to Computer Vision (2017) Lecture10: Subspace Methods and ace Recognition Bohyung Han CSE, POSTECH bhhan@postech.ac.kr
More informationEfficient Computation of Trade-Off Skylines
Efficient Computation of Trade-Off Skylines Christoph Lofi Institute for Information Systems Mühlenpfordtstr. 23 38104 Braunschweig, Germany lofi@ifis.cs.tu-bs.de Ulrich Güntzer Institute of Computer Science
More informationCMU Social choice: Advanced manipulation. Teachers: Avrim Blum Ariel Procaccia (this time)
CMU 15-896 Social choice: Advanced manipulation Teachers: Avrim Blum Ariel Procaccia (this time) Recap A Complexity-theoretic barrier to manipulation Polynomial-time greedy alg can successfully decide
More informationA Three-Way Model for Collective Learning on Multi-Relational Data
A Three-Way Model for Collective Learning on Multi-Relational Data 28th International Conference on Machine Learning Maximilian Nickel 1 Volker Tresp 2 Hans-Peter Kriegel 1 1 Ludwig-Maximilians Universität,
More informationInconsistencies in Steady State. Thermodynamics
Inconsistencies in Steady State Thermodynamics Ronald Dickman and Ricardo Motai Universidade Federal de Minas Gerais OUTLINE Nonequilibrium steady states (NESS) Steady state thermodynamics (Oono-Paniconi,
More informationVector Space Model. Yufei Tao KAIST. March 5, Y. Tao, March 5, 2013 Vector Space Model
Vector Space Model Yufei Tao KAIST March 5, 2013 In this lecture, we will study a problem that is (very) fundamental in information retrieval, and must be tackled by all search engines. Let S be a set
More informationCleaning Uncertain Data for Top-k Queries
Cleaning Uncertain Data for Top- Queries Luyi Mo, Reynold Cheng, Xiang Li, David Cheung, Xuan Yang Department of Computer Science, University of Hong Kong, Pofulam Road, Hong Kong {lymo, ccheng, xli, dcheung,
More informationData Dependencies in the Presence of Difference
Data Dependencies in the Presence of Difference Tsinghua University sxsong@tsinghua.edu.cn Outline Introduction Application Foundation Discovery Conclusion and Future Work Data Dependencies in the Presence
More informationTutorial: Urban Trajectory Visualization. Case Studies. Ye Zhao
Case Studies Ye Zhao Use Cases We show examples of the web-based visual analytics system TrajAnalytics The case study information and videos are available at http://vis.cs.kent.edu/trajanalytics/ Porto
More informationBoolean and Vector Space Retrieval Models
Boolean and Vector Space Retrieval Models Many slides in this section are adapted from Prof. Joydeep Ghosh (UT ECE) who in turn adapted them from Prof. Dik Lee (Univ. of Science and Tech, Hong Kong) 1
More informationCOMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from
COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from http://www.mmds.org Distance Measures For finding similar documents, we consider the Jaccard
More information(Re)introduction to Statistics Dan Lizotte
(Re)introduction to Statistics Dan Lizotte 2017-01-17 Statistics The systematic collection and arrangement of numerical facts or data of any kind; (also) the branch of science or mathematics concerned
More informationAnomaly Detection. Jing Gao. SUNY Buffalo
Anomaly Detection Jing Gao SUNY Buffalo 1 Anomaly Detection Anomalies the set of objects are considerably dissimilar from the remainder of the data occur relatively infrequently when they do occur, their
More informationLecture 21: HSP via the Pretty Good Measurement
Quantum Computation (CMU 5-859BB, Fall 205) Lecture 2: HSP via the Pretty Good Measurement November 8, 205 Lecturer: John Wright Scribe: Joshua Brakensiek The Average-Case Model Recall from Lecture 20
More informationPV211: Introduction to Information Retrieval
PV211: Introduction to Information Retrieval http://www.fi.muni.cz/~sojka/pv211 IIR 11: Probabilistic Information Retrieval Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk
More informationA Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes.
A Probability Primer A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes. Are you holding all the cards?? Random Events A random event, E,
More informationNetwork Localization via Schatten Quasi-Norm Minimization
Network Localization via Schatten Quasi-Norm Minimization Anthony Man-Cho So Department of Systems Engineering & Engineering Management The Chinese University of Hong Kong (Joint Work with Senshan Ji Kam-Fung
More informationNearest Neighbor Search with Keywords in Spatial Databases
776 Nearest Neighbor Search with Keywords in Spatial Databases 1 Sphurti S. Sao, 2 Dr. Rahila Sheikh 1 M. Tech Student IV Sem, Dept of CSE, RCERT Chandrapur, MH, India 2 Head of Department, Dept of CSE,
More informationENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition
ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition Wing-Kin (Ken) Ma 2017 2018 Term 2 Department of Electronic Engineering The Chinese University of Hong Kong Lecture 8: QR Decomposition
More informationSTOCHASTIC MODELS LECTURE 1 MARKOV CHAINS. Nan Chen MSc Program in Financial Engineering The Chinese University of Hong Kong (ShenZhen) Sept.
STOCHASTIC MODELS LECTURE 1 MARKOV CHAINS Nan Chen MSc Program in Financial Engineering The Chinese University of Hong Kong (ShenZhen) Sept. 6, 2016 Outline 1. Introduction 2. Chapman-Kolmogrov Equations
More information1 Heuristics for the Traveling Salesman Problem
Praktikum Algorithmen-Entwurf (Teil 9) 09.12.2013 1 1 Heuristics for the Traveling Salesman Problem We consider the following problem. We want to visit all the nodes of a graph as fast as possible, visiting
More informationarxiv: v1 [cs.db] 2 Sep 2014
An LSH Index for Computing Kendall s Tau over Top-k Lists Koninika Pal Saarland University Saarbrücken, Germany kpal@mmci.uni-saarland.de Sebastian Michel Saarland University Saarbrücken, Germany smichel@mmci.uni-saarland.de
More information