Comparison between ontology distances (preliminary results)
1 Comparison between ontology distances (preliminary results). Jérôme David, Jérôme Euzenat. ISWC
2-3 Context
A distributed and heterogeneous environment: the semantic Web.
- Several ontologies exist for the same domain.
- Problem: how to facilitate communication between programs (software programs, peers, etc.) that use different ontologies?
- An alignment acts as a mediator between the two programs.
4 Context
How to deal with many ontologies at Web scale? A solution: distances between ontologies.
- In peer-to-peer systems: find a peer using a close ontology.
- In ontology engineering: find related ontologies to facilitate ontology reuse.
- etc.
Objectives:
- Review some distances between ontologies (distances derived from ontology matching).
- Make a preliminary evaluation of these measures: accuracy vs. speed.
5 Various kinds of ontology distances
Two kinds of ontology distances are evaluated:
- Vectorial distances: ontologies are viewed as bags of terms.
  1. Which weighting scheme to use?
  2. Which vector distance to use?
- Entity-based distances: the distance between ontologies is a function of the distance between their entities.
  1. Entity distances: structural and/or lexical?
  2. Aggregating entity distance values: which collection distance to use?
6 Vectorial ontology distances
[Figure: a fragment of a RedWine ontology (class Médoc, individual MyChateauMargauxBottle labelled "Château Margaux", with names such as ProducedIn, GrapeVariety, Producer, Vintage, Appellation, Name, City, BordeauxArea, Cabernet Sauvignon, Merlot, Margaux, Cantenac, 1998, ChateauMargaux) and the ontology vector built from it.]
Example comment: "Médoc wines are French red wines produced in the Bordeaux wine region."
Resulting terms: wine, red, médoc, french, bordeaux, cabernet, sauvignon, merlot, margaux, chateau, cantenac, 1998, grape, appellation, producer.
- Vector measures: Cosine, Jaccard (Tanimoto index).
- Possible weights: Boolean, Frequency, TF.IDF.
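A minimal sketch of the vectorial approach described on this slide, assuming an ontology has already been reduced to a list of terms. The function names (`term_vector`, `jaccard_distance`, `cosine_distance`, `tfidf`) and the `log(N / df)` form of the IDF weight are illustrative assumptions, not the implementation used by the authors.

```python
import math
from collections import Counter


def term_vector(terms):
    """Bag of terms with raw frequency (TF) weights."""
    return Counter(t.lower() for t in terms)


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| computed on the sets of terms (Boolean weights)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def cosine_distance(a, b):
    """1 - cosine of the angle between two weighted term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def tfidf(vectors):
    """Reweight TF vectors by log(N / df), one common TF.IDF variant (assumption)."""
    n = len(vectors)
    df = Counter(t for v in vectors for t in v)  # number of ontologies containing each term
    return [{t: tf * math.log(n / df[t]) for t, tf in v.items()} for v in vectors]


# Hypothetical term lists for two small ontologies.
o1 = term_vector(["wine", "red", "médoc", "bordeaux", "merlot"])
o2 = term_vector(["wine", "white", "bordeaux", "chardonnay"])
print(jaccard_distance(o1, o2), cosine_distance(o1, o2))
```

With only two ontologies in the collection, TF.IDF gives zero weight to every shared term, which illustrates why the choice of weighting scheme matters when the collection is small.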
7 Entity distances
- Label-based distance:
  - String distance between entity annotations: Jaro-Winkler
  - Minimum weight pairing + arithmetic mean
- Structural distances:
  - OLA similarity: structural iterative similarity for OWL ontologies
  - Triple-based iterative similarity: structural iterative similarity for RDF graphs
Problem: how to aggregate entity measure values? Given the entity distance values between the concepts of o and the concepts of o', what is the ontology distance value?
8-12 Collection measures
Goal: compute an ontology measure from concept measure values.
Collection measures used:
- Average linkage: the arithmetic mean of concept measure values.
- Hausdorff: $\mathrm{Hausdorff}(o, o') = \max\big(\max_{e \in o} \min_{e' \in o'} \delta_K(e, e'),\ \max_{e' \in o'} \min_{e \in o} \delta_K(e, e')\big)$
- Minimum Weight Maximum Graph Matching (MWMGM) distance.
[Figures: concepts of o matched against concepts of o'; the numeric values of the worked examples ("mean = ...") are not preserved.]
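The three collection measures can be sketched on a matrix `d` where `d[i][j]` is the entity distance between the i-th concept of o and the j-th concept of o'. This is an illustration under assumptions: the slide does not give the MWMGM formula, so the version below (minimum-weight matching, unmatched concepts counted at distance 1, normalized by the size of the larger ontology) is an assumed definition, and the brute-force search is only practical for tiny matrices.

```python
from itertools import permutations


def average_linkage(d):
    """Arithmetic mean of all pairwise entity distances."""
    values = [x for row in d for x in row]
    return sum(values) / len(values)


def hausdorff(d):
    """Largest distance from any concept to its closest counterpart, in either direction."""
    forward = max(min(row) for row in d)          # from concepts of o to o'
    backward = max(min(col) for col in zip(*d))   # from concepts of o' to o
    return max(forward, backward)


def mwmgm(d):
    """Assumed MWMGM: cheapest one-to-one matching, unmatched concepts cost 1."""
    # Make sure the smaller ontology indexes the rows.
    rows = d if len(d) <= len(d[0]) else [list(col) for col in zip(*d)]
    n, m = len(rows), len(rows[0])
    best = min(
        sum(rows[i][p[i]] for i in range(n))
        for p in permutations(range(m), n)  # brute force: exponential, toy sizes only
    )
    return (best + (m - n)) / m


# Hypothetical entity distances: 3 concepts in o, 2 concepts in o'.
d = [[0.2, 0.9],
     [0.8, 0.4],
     [0.5, 0.5]]
print(average_linkage(d), hausdorff(d), mwmgm(d))
```

For realistic ontology sizes the matching step would use a polynomial assignment algorithm rather than enumerating permutations.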
13 Selected measures
12 measures evaluated:
- 3 vectorial measures:
  - Jaccard: the most basic measure (proportion of common terms)
  - Cosine + TF weights
  - Cosine + TF.IDF weights
- 9 entity-based measures, obtained by combining 3 entity measures with 3 collection measures:
  - Entity measures: EntityLexicalMeasure, TripleBasedEntitySim, OLAEntitySim
  - Collection measures: AverageLinkage, Hausdorff, MWMGM
14 Benchmark suite
OAEI benchmark test set: 1 reference ontology (101) and altered ontologies.
6 kinds of alterations:
- Name: suppressed, randomized, synonyms, convention, other language
- Comment: suppressed, other language
- Specialization hierarchy: suppressed, expanded, flattened
- Instances: suppressed
- Properties: suppressed, discarded restrictions
- Classes: split, flattened
Order between alterations. Example on names: {suppressed, randomized} < {synonym, convention, other language}, i.e. an ontology with synonym names is closer to the reference ontology than an ontology with no names.
15 Benchmark suite
Induced proximity order: 101 is closer to 225 than ... [figure and remainder of the example not preserved]
16 Benchmark suite
Induced proximity order (with partial alterations). [figure not preserved]
17-19 Tests
We performed 3 different tests:
- Order comparison on the benchmark test set: check whether the similarity orders are compatible with the order induced by alterations. For each pair o, o', check that $101 < o < o' \Rightarrow \delta(101, o) < \delta(101, o')$. Caveat: the test set is biased towards 1-1 matching, so some measures are favored.
- Different cardinality matching: compare with related but different ontologies, and with unrelated ontologies.
- Time consumption test: compare the time consumption of the evaluated measures.
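The order comparison test can be sketched as follows: for every pair (o, o') for which the alterations induce 101 < o < o', check whether the measure returns δ(101, o) < δ(101, o'), and report the proportion of pairs that pass. The function name, the ontology labels and the distance values below are hypothetical and chosen only to make the example runnable.

```python
def order_test_ratio(reference, induced_pairs, distance):
    """Proportion of induced pairs (closer, farther) that the measure orders correctly.

    induced_pairs: pairs (o, o2) such that reference < o < o2 in the induced order.
    distance: callable taking two ontology identifiers and returning a number.
    """
    if not induced_pairs:
        return float("nan")
    passed = sum(
        1
        for closer, farther in induced_pairs
        if distance(reference, closer) < distance(reference, farther)
    )
    return passed / len(induced_pairs)


# Made-up distances from the reference ontology 101 to three altered ontologies.
toy = {"synonym_names": 0.1, "no_names": 0.4, "random_names": 0.3}


def toy_distance(ref, other):
    return toy[other]


pairs = [
    ("synonym_names", "no_names"),      # ordered correctly by the toy measure
    ("no_names", "random_names"),       # hypothetical pair, ordered incorrectly
    ("synonym_names", "random_names"),  # ordered correctly
]
print(order_test_ratio("101", pairs, toy_distance))
```

The strict inequality means that ties count as failures, which is one possible reading of the test; a non-strict comparison would be an equally simple variant.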
20-23 Order comparison
Proportion of pairs o, o' satisfying the induced order.
Table columns: Measure; Tests passed (ratio), Original and Enhanced. The numeric values are not preserved in this transcript, except NaN / NaN for Hausdorff (EntityLexicalMeasure). Measures listed:
- MWMGM (EntityLexicalMeasure)
- Hausdorff (EntityLexicalMeasure)
- AverageLinkage (EntityLexicalMeasure)
- MWMGM (OLAEntitySim)
- Hausdorff (OLAEntitySim)
- AverageLinkage (OLAEntitySim)
- MWMGM (TripleBasedEntitySim)
- Hausdorff (TripleBasedEntitySim)
- AverageLinkage (TripleBasedEntitySim)
- CosineVM (TF)
- CosineVM (TFIDF)
- JaccardVM (TF)
Best entity measure: TripleBasedEntitySim.
Best collection measure: MWMGM.
Best vectorial measure: Cosine with TF weights.
- Best results: TripleBasedEntitySim with MWMGM.
- Cosine with TF weights: a basic measure, but it works well.
- Hausdorff works only with TripleBasedEntitySim.
- Is MWMGM favored by the 1-1 nature of this test?
24 Different cardinality matching
Objective: ensure that MWMGM is still the best measure when the ontologies have different cardinalities.
Test set:
- A reference ontology: 101
- Highly altered ontologies: 2xx
- Different but similar ontologies: 3xx
- Irrelevant ontologies: conference test set (could share some vocabulary with benchmark ontologies)
Expected results: δ(101, 2xx) < δ(101, 3xx) < δ(101, conference)
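One way to check the expected ordering δ(101, 2xx) < δ(101, 3xx) < δ(101, conference) is to list every pair of ontologies from different groups whose distances to 101 violate it. The sketch below assumes the distances to the reference ontology have already been computed; the function name, the label `confA` and all numeric values are made up for illustration.

```python
def misorderings(dist_by_group, expected=("2xx", "3xx", "conference")):
    """Return (near, far) ontology pairs that violate the expected group order.

    dist_by_group maps a group name to {ontology_id: distance to the reference}.
    """
    found = []
    for i, near_group in enumerate(expected):
        for far_group in expected[i + 1:]:
            for near, d_near in dist_by_group[near_group].items():
                for far, d_far in dist_by_group[far_group].items():
                    if not d_near < d_far:
                        found.append((near, far))
    return found


# Made-up distances to the reference ontology 101.
toy = {
    "2xx": {"248": 0.3, "250": 0.7},
    "3xx": {"304": 0.5},
    "conference": {"confA": 0.9},
}
print(misorderings(toy))
```

In this toy input only ontology 250 is farther from 101 than a 3xx ontology, so it is the only pair reported.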
25 Different cardinality matching
Results: MWMGM still performs best; vectorial measures fail when ontologies are lexically altered.
Main misorderings:
- MWMGM Hausdorff: only one altered (250) with similar (3xx) and conference ontologies; some similar (3xx) with conference ontologies
- with TripleBasedEntitySim: only 250 with similar (3xx) and a conference ontology
- with OLAEntitySim: 228, 248, 250, 251, 252 with 304 and conference ontologies
- AverageLinkage with LexicalEntitySim: does not work; highly altered (2xx) and similar (3xx) ontologies
- Vectorial measures: many altered (248, 250, 251, 252) ontologies with similar (3xx) ontologies
26-28 Time consumption
Are the measures usable in real-time applications?
- Vectorial measures are more time efficient.
- No significant differences between MWMGM ($N^3$), Hausdorff and AverageLinkage ($N^2$).
- Structural measures are not really usable in real-time applications.
Time consumption on the benchmark test. Table columns: Measure; Total time (s); Average time per similarity value (s). The numeric values are not preserved in this transcript. Measures listed: MWMGM, Hausdorff and AverageLinkage, each with EntityLexicalMeasure, OLAEntitySim and TripleBasedEntitySim; CosineVM (TF); CosineVM (TFIDF); JaccardVM (TF).
29 Conclusion
- Why use complex measures when basic measures perform well? For discriminating highly related ontologies.
- A more precise test set is needed.
- Characterize confidence intervals: is there a threshold beyond which returned values are always correct?
- Other ontology distance measures: alignment-based.
Term Weighting and Vector Space Model Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze 1 Ranked retrieval Thus far, our queries have all been Boolean. Documents either
More informationvector space retrieval many slides courtesy James Amherst
vector space retrieval many slides courtesy James Allan@umass Amherst 1 what is a retrieval model? Model is an idealization or abstraction of an actual process Mathematical models are used to study the
More informationMinimal Coverage for Ontology Signatures
Minimal Coverage for Ontology Signatures David Geleta, Terry R. Payne, and Valentina Tamma Department of Computer Science, University of Liverpool, UK {d.geleta t.r.payne v.tamma}@liverpool.ac.uk Abstract.
More informationWhat is Text mining? To discover the useful patterns/contents from the large amount of data that can be structured or unstructured.
What is Text mining? To discover the useful patterns/contents from the large amount of data that can be structured or unstructured. Text mining What can be used for text mining?? Classification/categorization
More informationLecture 2: Axiomatic semantics
Chair of Software Engineering Trusted Components Prof. Dr. Bertrand Meyer Lecture 2: Axiomatic semantics Reading assignment for next week Ariane paper and response (see course page) Axiomatic semantics
More informationOntological Modelling in Wikidata
Ontological Modelling in Wikidata Markus Krötzsch Knowledge-Based Systems, TU Dresden Workshop on Ontology Design and Patterns 2018 at ISWC 2018 Markus Krötzsch: Wikidata Toolkit Kickoff All slides CC-BY
More informationTowards Linkset Quality for Complementing SKOS Thesauri
Towards Linkset Quality for Complementing SKOS Thesauri Riccardo Albertoni, Monica De Martino, and Paola Podestà Istituto di Matematica Applicata e Tecnologie Informatiche Consiglio Nazionale delle Ricerche,
More informationLecture 2 August 31, 2007
CS 674: Advanced Language Technologies Fall 2007 Lecture 2 August 31, 2007 Prof. Lillian Lee Scribes: Cristian Danescu Niculescu-Mizil Vasumathi Raman 1 Overview We have already introduced the classic
More informationExploring Large Digital Library Collections using a Map-based Visualisation
Exploring Large Digital Library Collections using a Map-based Visualisation Dr Mark Hall Research Seminar, Department of Computing, Edge Hill University 7.11.2013 http://www.flickr.com/photos/carlcollins/199792939/
More informationSpatial Data on the Web Working Group. Geospatial RDA P9 Barcelona, 5 April 2017
Spatial Data on the Web Working Group Geospatial IG @ RDA P9 Barcelona, 5 April 2017 Spatial Data on the Web Joint effort of the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC)
More informationINFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from
INFO 4300 / CS4300 Information Retrieval slides adapted from Hinrich Schütze s, linked from http://informationretrieval.org/ IR 26/26: Feature Selection and Exam Overview Paul Ginsparg Cornell University,
More informationCSE 158 Lecture 10. Web Mining and Recommender Systems. Text mining Part 2
CSE 158 Lecture 10 Web Mining and Recommender Systems Text mining Part 2 Midterm Midterm is this time next week! (Nov 7) I ll spend next Monday s lecture on prep See also seven previous midterms on the
More informationRecap of the last lecture. CS276A Information Retrieval. This lecture. Documents as vectors. Intuition. Why turn docs into vectors?
CS276A Information Retrieval Recap of the last lecture Parametric and field searches Zones in documents Scoring documents: zone weighting Index support for scoring tf idf and vector spaces Lecture 7 This
More informationScoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology
Scoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)
More informationFrancisco M. Couto Mário J. Silva Pedro Coutinho
Francisco M. Couto Mário J. Silva Pedro Coutinho DI FCUL TR 03 29 Departamento de Informática Faculdade de Ciências da Universidade de Lisboa Campo Grande, 1749 016 Lisboa Portugal Technical reports are
More informationPart I: Web Structure Mining Chapter 1: Information Retrieval and Web Search
Part I: Web Structure Mining Chapter : Information Retrieval an Web Search The Web Challenges Crawling the Web Inexing an Keywor Search Evaluating Search Quality Similarity Search The Web Challenges Tim
More informationOn Politics and Argumentation
On Politics and Argumentation Maria Boritchev To cite this version: Maria Boritchev. On Politics and Argumentation. MALOTEC, Mar 2017, Nancy, France. 2017. HAL Id: hal-01666416 https://hal.archives-ouvertes.fr/hal-01666416
More informationToponym Disambiguation using Ontology-based Semantic Similarity
Toponym Disambiguation using Ontology-based Semantic Similarity David S Batista 1, João D Ferreira 2, Francisco M Couto 2, and Mário J Silva 1 1 IST/INESC-ID Lisbon, Portugal {dsbatista,msilva}@inesc-id.pt
More informationScoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology
Scoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2017 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)
More informationDescriptive Data Summarization
Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning
More informationUniversity of Florida CISE department Gator Engineering. Clustering Part 1
Clustering Part 1 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville What is Cluster Analysis? Finding groups of objects such that the objects
More informationSemantics, ontologies and escience for the Geosciences
Semantics, ontologies and escience for the Geosciences Femke Reitsma¹, John Laxton², Stuart Ballard³, Werner Kuhn, Alia Abdelmoty ¹femke.reitsma@ed.ac.uk: School of Geosciences, Edinburgh University ²
More informationMachine Learning. Ludovic Samper. September 1st, Antidot. Ludovic Samper (Antidot) Machine Learning September 1st, / 77
Machine Learning Ludovic Samper Antidot September 1st, 2015 Ludovic Samper (Antidot) Machine Learning September 1st, 2015 1 / 77 Antidot Software vendor since 1999 Paris, Lyon, Aix-en-Provence 45 employees
More informationOutline Introduction Background Related Rl dw Works Proposed Approach Experiments and Results Conclusion
A Semantic Approach to Detecting Maritime Anomalous Situations ti José M Parente de Oliveira Paulo Augusto Elias Emilia Colonese Carrard Computer Science Department Aeronautics Institute of Technology,
More informationAnnotation algebras for RDFS
University of Edinburgh Semantic Web in Provenance Management, 2010 RDF definition U a set of RDF URI references (s, p, o) U U U an RDF triple s subject p predicate o object A finite set of triples an
More information