Rule Learning from Knowledge Graphs Guided by Embedding Models


Rule Learning from Knowledge Graphs Guided by Embedding Models. Vinh Thinh Ho¹, Daria Stepanova¹, Mohamed Gad-Elrab¹, Evgeny Kharlamov², Gerhard Weikum¹. ¹Max Planck Institute for Informatics, Saarbrücken, Germany; ²University of Oxford, Oxford, United Kingdom. ISWC 2018. 1 / 13

Rule Learning from KGs 1 / 13

Rule Learning from KGs. Confidence, e.g., WARMER [Goethals and Van den Bussche, 2002]. CWA: whatever is missing is false.
r : livesIn(X, Y) ← isMarriedTo(Z, X), livesIn(Z, Y)
conf(r) = |confirmed predictions| / |all predictions| = 2/4. 1 / 13
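The CWA confidence above can be sketched in a few lines of Python. The triples and names below are illustrative toys, not the paper's dataset:

```python
# A minimal sketch of standard (CWA) confidence for the rule
#   livesIn(X, Y) <- isMarriedTo(Z, X), livesIn(Z, Y)
# over a toy KG of (subject, relation, object) triples.

kg = {
    ("brad", "isMarriedTo", "ann"),   ("brad", "livesIn", "berlin"),
    ("john", "isMarriedTo", "kate"),  ("john", "livesIn", "chicago"),
    ("bob", "isMarriedTo", "alice"),  ("bob", "livesIn", "berlin"),
    ("dave", "isMarriedTo", "clara"), ("dave", "livesIn", "amsterdam"),
    ("ann", "livesIn", "berlin"),     # confirms the prediction for ann
    ("kate", "livesIn", "chicago"),   # confirms the prediction for kate
    # alice's and clara's living places are missing from the KG
}

def rule_predictions(kg):
    """Ground the body: every isMarriedTo(Z, X) with livesIn(Z, Y)
    yields the prediction livesIn(X, Y)."""
    return {
        (x, "livesIn", y)
        for (z, r1, x) in kg if r1 == "isMarriedTo"
        for (z2, r2, y) in kg if r2 == "livesIn" and z2 == z
    }

preds = rule_predictions(kg)
# CWA: a prediction absent from the KG counts as a counterexample.
conf = sum(p in kg for p in preds) / len(preds)
print(conf)  # 2 of 4 predictions are confirmed -> 0.5
```

With two of four predicted facts present in the KG, the rule gets confidence 2/4, mirroring the slide's example.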

Rule Learning from KGs. PCA confidence, AMIE [Galárraga et al., 2015]. PCA: since Alice has a living place already, all other living places predicted for her are incorrect.
[Figure: example KG with married couples Brad–Ann, John–Kate, Bob–Alice, Dave–Clara, livesIn edges to Berlin, Chicago, and Amsterdam, hasBrother(Ann, John), and IsA Researcher edges.]
r : livesIn(X, Y) ← isMarriedTo(Z, X), livesIn(Z, Y)
conf_PCA(r) = 2/3. 1 / 13
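Under the PCA, an unconfirmed prediction counts against the rule only when its subject already has some known object for the relation. A minimal sketch, again with toy triples of our own rather than the paper's data:

```python
# Sketch of PCA confidence: a missing prediction livesIn(x, y) counts
# against the rule only if x already has some known living place.

kg = {
    ("brad", "isMarriedTo", "ann"),   ("brad", "livesIn", "berlin"),
    ("john", "isMarriedTo", "kate"),  ("john", "livesIn", "chicago"),
    ("bob", "isMarriedTo", "alice"),  ("bob", "livesIn", "berlin"),
    ("dave", "isMarriedTo", "clara"), ("dave", "livesIn", "amsterdam"),
    ("ann", "livesIn", "berlin"),
    ("kate", "livesIn", "chicago"),
    ("alice", "livesIn", "amsterdam"),  # alice has a known living place
    # clara has no known living place at all
}

preds = {
    (x, "livesIn", y)
    for (z, r1, x) in kg if r1 == "isMarriedTo"
    for (z2, r2, y) in kg if r2 == "livesIn" and z2 == z
}
has_place = {s for (s, r, o) in kg if r == "livesIn"}
confirmed = [p for p in preds if p in kg]
# PCA: only count unconfirmed predictions whose subject has a known object.
counted = [p for p in preds if p in kg or p[0] in has_place]
conf_pca = len(confirmed) / len(counted)
print(conf_pca)  # alice's wrong prediction counts, clara's is ignored -> 2/3
```

Alice already has a living place, so the wrong prediction about her is penalized; Clara has none, so her prediction is simply ignored, giving 2/3 as on the slide.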

Rule Learning from KGs. Exception-enriched rules [ISWC 2016, ILP 2016].
[Figure: same example KG; Alice is marked IsA Researcher.]
r : livesIn(X, Y) ← isMarriedTo(Z, X), livesIn(Z, Y), not isA(X, researcher)
conf(r) = conf_PCA(r) = 1. 1 / 13

Absurd Rules due to Data Incompleteness. Problem: rules learned from highly incomplete KGs might be absurd.
[Figure: example KG with worksAt edges from Brad, John, Bob, and Clara to ITech, OpenSys, ITService, and SoftComp, officeIn edges to Berlin and Chicago, and IsA ItCompany edges.]
livesIn(X, Y) ← worksAt(X, Z), officeIn(Z, Y), not isA(Z, itCompany)
conf(r) = conf_PCA(r) = 1. 2 / 13

Ideal KG. µ(r, G^i): measure the quality of the rule r on the ideal KG G^i, but G^i is unknown. 3 / 13

Probabilistic Reconstruction of Ideal KG. µ(r, G^i_p): measure the quality of r on G^i_p, a probabilistic reconstruction of G^i. 3 / 13

Hybrid Rule Measure.
µ(r, G^i_p) = (1 − λ) · µ1(r, G) + λ · µ2(r, G^i_p)
λ ∈ [0, 1]: weighting factor. µ1: descriptive quality of rule r over the available KG G (confidence, PCA confidence). µ2: predictive quality of r relying on G^i_p, the probabilistic reconstruction of the ideal KG G^i. 3 / 13
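The hybrid measure is a plain convex combination of the two qualities. A sketch (the default λ = 0.3 here is our choice for illustration; any value in [0, 1] is valid):

```python
def hybrid_measure(mu1, mu2, lam=0.3):
    """Combine descriptive quality mu1 (over the available KG G) with
    predictive quality mu2 (over the probabilistic reconstruction):
    (1 - lambda) * mu1 + lambda * mu2."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return (1.0 - lam) * mu1 + lam * mu2

# lam = 0 recovers the purely descriptive measure (e.g., AMIE-style
# confidence); lam = 1 scores rules only by their embedding-backed
# predictive quality.
print(hybrid_measure(0.5, 1.0, 0.0), hybrid_measure(0.5, 1.0, 1.0))
```

Sliding λ between the two extremes interpolates between trusting only the observed KG and trusting only the embedding model's view of the ideal KG.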

KG Embeddings. Popular approach to KG completion which has proved effective; relies on translating entities and relations into vector spaces. TransE [Bordes et al., 2013], SSP [Xiao et al., 2017], TEKE [Wang and Li, 2016].
[Figure: example KG with colleagueOf and livesIn edges among Bob, Mat, Jane, and Patti, plus accompanying text snippets, e.g. "Bob and Mat have successfully completed a project initiated by the SFO department of ITService", "SFO is a cultural and commercial center of CA", "Mat working as a developer in ITService, CA".]
score(<bob livesIn SFO>) = 0.8; score(<bob livesIn Berlin>) = 0.4. 4 / 13
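Translation-based scoring can be illustrated with TransE's core idea that h + r ≈ t for a true triple. The 2-d vectors below are hand-picked toys, not learned embeddings:

```python
def transe_score(h, r, t):
    """TransE-style plausibility: negative L1 norm of h + r - t.
    Closer to 0 means more plausible."""
    return -sum(abs(hv + rv - tv) for hv, rv, tv in zip(h, r, t))

# Toy 2-d embeddings (hand-picked for illustration only):
bob, lives_in = [1.0, 0.0], [0.0, 1.0]
sfo, berlin = [1.0, 1.0], [3.0, 2.0]

good = transe_score(bob, lives_in, sfo)     # bob + livesIn lands exactly on sfo
bad = transe_score(bob, lives_in, berlin)   # far from berlin
print(good, bad)
```

Real systems learn the vectors by gradient descent and typically squash such distances into probabilities; the point here is only the ranking: the plausible triple scores higher than the implausible one.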

Our Approach 5 / 13

Rule Construction. Clause exploration from general to specific; all first-order clauses [Shapiro, 1991]. From livesIn(X, Y): add atom → livesIn(X, Y) ← livesIn(U, V); unify variables → livesIn(X, X); unify variable to constant → livesIn(bob, Y). 6 / 13

Rule Construction. Clause exploration from general to specific; closed rules, AMIE [Galárraga et al., 2015]. Target: livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y). From livesIn(X, Y): add dangling atom → livesIn(X, Y) ← marriedTo(X, Z); add instantiated atom → livesIn(X, Y) ← isA(X, researcher); add closing atom → livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y). 6 / 13

Rule Construction. Clause exploration from general to specific; this work: closed and safe rules with negation, e.g., livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y), not isA(X, researcher). In addition to AMIE's operators: add negated atom → livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y), not moved(X, Y); add negated instantiated atom → livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y), not isA(X, researcher). 6 / 13
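The refinement operators can be sketched as functions mapping a rule to its one-step specializations. The representation below (atoms as sign/relation/argument tuples) is our own illustration, not the paper's implementation:

```python
from itertools import count

# An atom is (sign, relation, arg1, arg2); a rule is (head_atom, body_atoms).

def add_dangling_atom(rule, relations, fresh):
    """Append a positive atom that shares one variable with the rule
    and introduces one fresh variable."""
    head, body = rule
    rule_vars = sorted({v for (_, _, a, b) in [head] + body for v in (a, b)})
    return [
        (head, body + [("+", rel, v, next(fresh))])
        for rel in relations for v in rule_vars
    ]

def add_negated_atom(rule, relations):
    """Append a negated atom over variables already in the rule; since
    every variable also occurs positively, the rule stays safe."""
    head, body = rule
    rule_vars = sorted({v for (_, _, a, b) in [head] + body for v in (a, b)})
    return [
        (head, body + [("not", rel, v1, v2)])
        for rel in relations
        for v1 in rule_vars for v2 in rule_vars if v1 != v2
    ]

fresh = (f"V{i}" for i in count())
rule = (("+", "livesIn", "X", "Y"),
        [("+", "marriedTo", "X", "Z"), ("+", "livesIn", "Z", "Y")])
dangling = add_dangling_atom(rule, ["worksAt"], fresh)
negated = add_negated_atom(rule, ["moved"])
print(len(dangling), len(negated))
```

With three variables (X, Y, Z) and one candidate relation per operator, this yields 3 dangling and 6 negated refinements; a real miner would interleave such expansion with quality-based pruning.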

Rule Pruning.
[Figure: search tree of refinements of livesIn(X, Y), e.g., livesIn(X, Y) ← worksAt(X, Z), livesIn(X, Y) ← worksAt(X, Z), officeIn(Z, Y), and livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y), not researcher(X), with low-quality branches crossed out.]
Prune the rule search space relying on the novel hybrid embedding-based rule measure. 7 / 13

Embedding-based Rule Quality. Estimate the average quality of the predictions made by a given rule r:
µ2(r, G^i_p) = (1 / |predictions(r, G)|) · Σ_{fact ∈ predictions(r, G)} p(fact)
Rely on the truthfulness of the predictions made by r based on the probabilistic reconstruction G^i_p of G^i.
Example: livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y). Rule predictions: livesIn(mat, monterey), livesIn(dave, chicago).
µ2(r, G^i_p) = (p(<mat livesIn monterey>) + p(<dave livesIn chicago>)) / 2. 8 / 13

Embedding-based Rule Quality. Estimate the average quality of the predictions made by a given rule r:
µ2(r, G^i_p) = (1 / |predictions(r, G)|) · Σ_{fact ∈ predictions(r, G)} p(fact)
Example: livesIn(X, Y) ← marriedTo(X, Z), livesIn(Z, Y), not isA(X, surfer). The exception excludes livesIn(mat, monterey), leaving the single prediction livesIn(dave, chicago).
µ2(r, G^i_p) = p(<dave livesIn chicago>) / 1. µ2(r, G^i_p) goes down for noisy exceptions. 8 / 13
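µ2 then amounts to averaging the embedding model's scores over the rule's predictions. In the sketch below, `score_fn` stands in for any embedding scorer (TransE, HolE, SSP, ...), and the facts and score values are made up for illustration:

```python
def mu2(predicted_facts, score_fn):
    """Average plausibility p(fact) over a rule's predictions; an empty
    prediction set contributes no predictive evidence."""
    if not predicted_facts:
        return 0.0
    return sum(score_fn(f) for f in predicted_facts) / len(predicted_facts)

# Hypothetical embedding scores for the slide's two predictions:
scores = {
    ("mat", "livesIn", "monterey"): 0.7,
    ("dave", "livesIn", "chicago"): 0.9,
}

both = mu2(list(scores), scores.get)  # plain rule: both predictions count
one = mu2([("dave", "livesIn", "chicago")], scores.get)  # exception filters mat
print(both, one)
```

Adding an exception changes the prediction set, and hence µ2: if the exception wrongly removes a high-scoring prediction, the measure drops, which is exactly how noisy exceptions are penalized.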

Evaluation Setup. Datasets: FB15K (592K facts, 15K entities, 1345 relations); Wiki44K (250K facts, 44K entities, 100 relations). Training graph G: remove 20% of the available KG. Embedding models for G^i_p: TransE [Bordes et al., 2013], HolE [Nickel et al., 2016]; with text: SSP [Xiao et al., 2017]. Goals: evaluate the effectiveness of our hybrid rule measure µ(r, G^i_p) = (1 − λ) · µ1(r, G) + λ · µ2(r, G^i_p); compare against state-of-the-art rule learning systems. 9 / 13

Evaluation of Hybrid Rule Measure.
[Figure: average precision of top-k rules (k = 5, 10, 20, 50, 100, 200) ranked using λ variations of µ on FB15K; panels (a) Conf-HolE, (b) Conf-SSP, (c) PCA-SSP.]
Positive impact of embeddings in all cases for λ = 0.3. Note: in (c), λ = 0 corresponds to a comparison against AMIE [Galárraga et al., 2015]. 10 / 13

Example Rules. Examples of rules learned from Wikidata. Script writers stay the same throughout a sequel, but not for TV series: r1 : scriptwriterOf(X, Y) ← precededBy(Y, Z), scriptwriterOf(X, Z), not isA(Z, tvSeries). Nobles are typically married to nobles, but not in the case of Chinese dynasties: r2 : nobleFamily(X, Y) ← spouse(X, Z), nobleFamily(Z, Y), not isA(Y, chineseDynasty). 11 / 13

Related Work Inductive logic programming [Muggleton et al, 1990] Data mining [Mannila, 1996] Learning from interpretations [De Raedt et al, 2014] Neural-based rule learning [Yang et al, 2017, Evans et al, 2018] Relational association rule learning [Goethals et al, 2002, Galárraga et al, 2015, ISWC 2016, ILP 2017] Interactive pattern mining [Goethals et al, 2011, Dzyuba et al, 2017] Exploiting cardinality info [ISWC 2017] Rule Learning guided by embedding models 12 / 13

Conclusion. Summary: a framework for learning rules from KGs with external sources; a hybrid embedding-based rule quality measure; experimental evaluation on real-world KGs; the approach is orthogonal to the concrete embedding model used. Outlook: other rule types, e.g., with existentials in the head, or constraints; plugging in a portfolio of embeddings; mimicking the framework of exact learning [Angluin, 1987] by issuing complex queries to embeddings. 13 / 13

References I
Dana Angluin. Queries and concept learning. Machine Learning, 2(4):319-342, 1987.
Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Proceedings of NIPS, pages 2787-2795, 2013.
Luis Galárraga, Christina Teflioudi, Katja Hose, and Fabian M. Suchanek. Fast rule mining in ontological knowledge bases with AMIE+. VLDB Journal, 24(6):707-730, 2015.
Bart Goethals and Jan Van den Bussche. Relational association rules: getting warmer. In PDD, 2002.
Maximilian Nickel, Lorenzo Rosasco, and Tomaso A. Poggio. Holographic embeddings of knowledge graphs. In AAAI, 2016.
Ehud Y. Shapiro. Inductive inference of theories from facts. In Computational Logic: Essays in Honor of Alan Robinson, pages 199-254, 1991.
Zhigang Wang and Juan-Zi Li. Text-enhanced representation learning for knowledge graph. In IJCAI, 2016.
Han Xiao, Minlie Huang, Lian Meng, and Xiaoyan Zhu. SSP: Semantic space projection for knowledge graph embedding with text descriptions. In AAAI, 2017.