
A Convolutional Neural Network-based Model for Knowledge Base Completion. Dat Quoc Nguyen. Joint work with Dai Quoc Nguyen, Tu Dinh Nguyen and Dinh Phung. April 16, 2018.

Introduction. Word vectors learned from a large corpus can model relational similarities or linguistic regularities between pairs of words as translations in the projected vector space. [Figure: country and capital word vectors (Moscow-Russia, Tokyo-Japan, Ankara-Turkey, Warsaw-Poland, Berlin-Germany, Paris-France, Rome-Italy, Athens-Greece, Madrid-Spain, Lisbon-Portugal), where each capital-to-country offset is roughly the same translation vector.] Similarly, a "royal" relationship holds between king and man, and between queen and woman.
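As a toy illustration of this translation property (hand-picked 2-D vectors chosen for this sketch, not real word embeddings):

```python
import numpy as np

# Hand-picked 2-D toy vectors exhibiting the capital -> country offset
vec = {
    "Moscow": np.array([1.0, 0.0]), "Russia": np.array([1.0, 1.0]),
    "Tokyo":  np.array([2.0, 0.1]), "Japan":  np.array([2.0, 1.1]),
}

print(vec["Russia"] - vec["Moscow"])  # [0. 1.]
print(vec["Japan"] - vec["Tokyo"])    # [0. 1.] -- the same "is capital of" translation
```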

Introduction. Let us consider the country and capital pairs to be pairs of entities rather than word types. We represent country and capital entities by low-dimensional and dense vectors. The relational similarity between the word pairs presumably captures an "is capital of" relationship between country and capital entities; we also represent this relationship by a translation vector in the entity vector space. [Figure: the same country-capital plot, now over entity vectors.]

Introduction. This intuition inspired TransE, a well-known embedding model for KB completion, i.e., link prediction in KBs (Bordes et al., 2013). [Figure: the country-capital plot again, illustrating the translation intuition behind TransE.]

Introduction. KBs are collections of real-world triples, where each triple or fact (h, r, t) represents some relation r between a head entity h and a tail entity t. Entities are real-world things or objects such as persons, places, organizations, music tracks or movies. Each relation type defines a certain relationship between entities: e.g., the relation type "child of" relates person entities with each other, while the relation type "born in" relates person entities with place entities. KBs are thus useful resources for many NLP tasks.
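For concreteness, a KB can be stored as a plain set of (head, relation, tail) triples; the entities and facts below are illustrative:

```python
# A toy KB: each fact is a (head entity, relation type, tail entity) triple
kb = {
    ("Jane", "child of", "Mary"),     # "child of" relates person entities to each other
    ("Jane", "born in", "Miami"),     # "born in" relates a person to a place
    ("Miami", "located in", "USA"),
}

print(("Jane", "born in", "Miami") in kb)  # True: this fact is present in the KB
```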

Introduction. Issue: KBs are far from complete. In English DBpedia 2014, 60% of person entities miss a place of birth and 58% of the scientists do not have a fact about what they are known for (Krompaß et al., 2015). In Freebase, 71% of 3 million person entities miss a place of birth, 75% do not have a nationality, while 94% have no facts about their parents (West et al., 2014). So a question answering application based on an incomplete KB may fail to provide a correct answer even when the question is correctly interpreted: e.g., it would be impossible to answer "Where was Jane born?" if the KB lacks Jane's place of birth.

Introduction. KB completion, or link prediction: predict whether a relationship/triple not in the KB is likely to be true, i.e., add new triples by leveraging existing triples in the KB. E.g., predict the missing tail entity in the incomplete triple (Jane, born in, ?), or predict whether the triple (Jane, born in, Miami) is correct or not.

Embedding models for KB completion. Embedding models for KB completion have been proven to give state-of-the-art link prediction performance. Entities are represented by latent feature vectors. Relation types are represented by latent feature vectors, matrices or third-order tensors. [Figure: the country-capital entity plot.]

Embedding models for KB completion. For each triple (h, r, t), the embedding models define a score function f(h, r, t) measuring its implausibility: choose f such that the score f(h, r, t) of a correct triple is smaller than the score f(h', r', t') of an incorrect triple (h', r', t'). TransE: f(h, r, t) = ||v_h + v_r - v_t||, where v_h, v_r and v_t are the embeddings of h, r and t and ||·|| is the ℓ1 or ℓ2 norm. [Figure: the country-capital entity plot, illustrating the translation v_h + v_r ≈ v_t.]
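To make this concrete, here is a minimal NumPy sketch of the TransE score; the embedding values and the choice of the ℓ1 norm are illustrative assumptions, not the setup used in the experiments:

```python
import numpy as np

def transe_score(v_h, v_r, v_t, norm=1):
    # TransE implausibility score ||v_h + v_r - v_t||: lower means more plausible
    return np.linalg.norm(v_h + v_r - v_t, ord=norm)

# Toy 4-dimensional embeddings (hand-picked, illustrative values only)
v_berlin  = np.array([0.9, 0.1, 0.3, 0.0])
v_capital = np.array([0.0, 0.8, -0.2, 0.1])
v_germany = np.array([0.9, 0.9, 0.1, 0.1])

print(transe_score(v_berlin, v_capital, v_germany))  # small score -> plausible triple
```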


Embedding models for KB completion. A common objective function is the following margin-based function: L = Σ max(0, γ + f(h, r, t) - f(h', r', t')), summed over correct triples (h, r, t) in the KB and incorrect triples (h', r', t') generated by corrupting them, where γ is the margin hyper-parameter.
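A minimal sketch of this objective, assuming the TransE score with the ℓ1 norm and one corrupted triple per correct triple; the entities, relation and embedding values are made up for illustration:

```python
import numpy as np

# Toy embeddings for entities and relation types (random, illustrative)
rng = np.random.default_rng(0)
ent = {e: rng.normal(size=4) for e in ["Jane", "Miami", "Tokyo"]}
rel = {"born in": rng.normal(size=4)}

def f(h, r, t):
    # TransE implausibility score with the l1 norm
    return np.linalg.norm(ent[h] + rel[r] - ent[t], ord=1)

def margin_loss(correct, corrupted, gamma=1.0):
    # Sum of max(0, gamma + f(correct) - f(corrupted)) over paired triples
    return sum(max(0.0, gamma + f(*c) - f(*c2)) for c, c2 in zip(correct, corrupted))

# (Jane, born in, Tokyo) corrupts the tail of the correct (Jane, born in, Miami)
print(margin_loss([("Jane", "born in", "Miami")], [("Jane", "born in", "Tokyo")]))
```

Training minimizes this loss (e.g., with SGD), pushing correct triples below their corruptions by at least the margin γ.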

A CNN-based model for KB completion. Each embedding triple (v_h, v_r, v_t) is viewed as a 3-column matrix A = [v_h, v_r, v_t] of size k × 3. Each filter ω of size 1 × 3 is repeatedly operated over every row of A to generate a feature map. Let Ω and * denote the set of filters and the convolution operator, respectively. Our ConvKB formally defines the score function as f(h, r, t) = concat(g([v_h, v_r, v_t] * Ω)) · w. [Figure: ConvKB architecture with k = 4: the 4 × 3 input matrix for (h, r, t), 3 filters of size 1 × 3, convolution + ReLU producing 3 feature maps, which are concatenated and combined via a dot product with the weight vector w to give the score.]

A CNN-based model for KB completion. ConvKB formally defines the score function as f(h, r, t) = concat(g([v_h, v_r, v_t] * Ω)) · w. If we use only one filter ω, fix ω = [1, 1, -1] and w = 1 during training, and take g to be the absolute value, then ConvKB reduces to the plain TransE model. The ConvKB model can thus be viewed as an extension of TransE. [Figure: the same ConvKB architecture with k = 4.]
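A minimal NumPy sketch of this score function; the filter count, dimensions and random initialization are illustrative (the real model learns Ω, w and the embeddings jointly):

```python
import numpy as np

def convkb_score(v_h, v_r, v_t, filters, w):
    # ConvKB score: concat(ReLU([v_h, v_r, v_t] * Omega)) . w
    A = np.stack([v_h, v_r, v_t], axis=1)       # k x 3 input matrix
    feature_maps = [np.maximum(A @ omega, 0.0)  # each 1x3 filter slides over the k rows
                    for omega in filters]       # -> one k-dim feature map per filter
    v = np.concatenate(feature_maps)            # concatenated (k * |Omega|)-dim vector
    return float(v @ w)                         # dot product with the weight vector w

k, tau = 4, 3                                   # k = 4 and 3 filters, as in the figure
rng = np.random.default_rng(0)
filters = [rng.normal(size=3) for _ in range(tau)]
w = rng.normal(size=k * tau)
print(convkb_score(rng.normal(size=k), rng.normal(size=k), rng.normal(size=k), filters, w))
```

Fixing filters = [np.array([1.0, 1.0, -1.0])], w = np.ones(k) and swapping the ReLU for np.abs recovers the TransE ℓ1 score, matching the reduction described above.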

KB completion experiments. On both FB13 and WN11, each validation and test set also contains the same number of incorrect triples as correct triples. Link prediction task: predict h given (?, r, t) or predict t given (h, r, ?), where ? denotes the missing element. Corrupt each correct test triple (h, r, t) by replacing either h or t by each of the possible entities, and rank these candidates by their implausibility scores. Metrics: mean rank (MR), mean reciprocal rank (MRR), and Hits@10 (i.e., the proportion of valid test triples ranked in the top 10 predictions).
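The ranking metrics can be sketched as follows, assuming a score function where lower means more plausible; the toy ranks are made up for illustration:

```python
import numpy as np

def rank_of_correct(scores, correct_idx):
    # Rank = 1 + number of candidate entities scored as more plausible (lower score)
    return int((scores < scores[correct_idx]).sum()) + 1

def metrics(ranks):
    ranks = np.asarray(ranks, dtype=float)
    return {"MR": ranks.mean(),                 # mean rank
            "MRR": (1.0 / ranks).mean(),        # mean reciprocal rank
            "Hits@10": (ranks <= 10).mean()}    # proportion ranked in the top 10

# Toy example: correct entities of three test triples ranked 1st, 4th and 12th
print(metrics([1, 4, 12]))  # MR = 5.67, MRR = 0.44, Hits@10 = 0.67
```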

KB completion experiments Link prediction results:

KB completion experiments Hits@10 on FB15k-237:

KB completion experiments Hits@10 on WN18RR:

KB completion experiments. Triple classification task: predict whether a triple (h, r, t) is correct or not. Set a relation-specific threshold θ_r for each relation type r. For an unseen test triple (h, r, t), if f(h, r, t) is smaller than θ_r then the triple is classified as correct, otherwise as incorrect. Relation-specific thresholds are determined by maximizing the micro-averaged accuracy on the validation set.
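A sketch of this threshold search, scanning candidate thresholds over the observed validation scores; the scores and labels below are made up for illustration:

```python
import numpy as np

def best_threshold(scores, labels):
    # Pick the relation-specific threshold maximizing validation accuracy.
    # labels: True for correct triples; lower score = more plausible.
    best_acc, best_theta = -1.0, None
    for theta in np.unique(scores):
        acc = np.mean((scores < theta) == labels)  # "correct" iff score below threshold
        if acc > best_acc:
            best_acc, best_theta = acc, float(theta)
    return best_theta, best_acc

scores = np.array([0.2, 0.5, 1.1, 1.7])  # validation scores for one relation type
labels = np.array([True, True, False, False])
print(best_threshold(scores, labels))    # -> (1.1, 1.0)
```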

KB completion experiments Triple classification results on FB13

Conclusion. We have proposed an embedding model, ConvKB, for knowledge base completion. ConvKB outperforms previous state-of-the-art models on two benchmark link prediction datasets (WN18RR and FB15k-237) and on two benchmark triple classification datasets (WN11 and FB13).

Code: https://github.com/daiquocnguyen/convkb

References:
Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen and Dinh Phung. 2018. A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network. In Proceedings of NAACL-HLT 2018, to appear. https://arxiv.org/abs/1712.02121
Dat Quoc Nguyen. 2017. An overview of embedding models of entities and relationships for knowledge base completion. https://arxiv.org/abs/1703.08098