
CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information

Vashishth, Shikhar and Jain, Prince and Talukdar, Partha (2018) CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information. In: 27th World Wide Web (WWW) Conference, APR 23-27, 2018, Lyon, FRANCE, pp. 1317-1327.

Pro_The_Wor_Wid_Web_2018.pdf - Published Version
Restricted to Registered users only

Official URL: http://dx.doi.org/10.1145/3178876.3186030

Abstract

Open Information Extraction (OpenIE) methods extract (noun phrase, relation phrase, noun phrase) triples from text, resulting in the construction of large Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in such Open KBs are not canonicalized, leading to the storage of redundant and ambiguous facts. Recent research has posed canonicalization of Open KBs as clustering over manually defined feature spaces. Manual feature engineering is expensive and often sub-optimal. To overcome this challenge, we propose Canonicalization using Embeddings and Side Information (CESI), a novel approach which performs canonicalization over learned embeddings of Open KBs. CESI extends recent advances in KB embedding by incorporating relevant NP and relation phrase side information in a principled manner. Through extensive experiments on multiple real-world datasets, we demonstrate CESI's effectiveness.
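
The idea of canonicalizing noun phrases by clustering their embeddings can be illustrated with a minimal sketch. This is not the CESI implementation: the three-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration, the `threshold` value is a hypothetical choice, and a simple union-find grouping over cosine similarity stands in for whatever clustering procedure the paper uses. Side information is not modeled here at all.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def canonicalize(embeddings, threshold=0.9):
    """Group phrases whose embeddings are similar (hypothetical helper).

    Any two phrases with cosine similarity >= threshold end up in the
    same cluster (single-link grouping via union-find). Each cluster is
    keyed by its shortest phrase, an arbitrary choice of representative.
    """
    phrases = list(embeddings)
    parent = {p: p for p in phrases}

    def find(p):
        # Follow parent links to the cluster root, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for i, p in enumerate(phrases):
        for q in phrases[i + 1:]:
            if cosine(embeddings[p], embeddings[q]) >= threshold:
                parent[find(q)] = find(p)

    clusters = defaultdict(list)
    for p in phrases:
        clusters[find(p)].append(p)

    return {
        min(members, key=lambda m: (len(m), m)): sorted(members)
        for members in clusters.values()
    }


# Invented toy vectors; real embeddings would be learned from the Open KB.
TOY_EMBEDDINGS = {
    "Barack Obama": [0.90, 0.10, 0.05],
    "Obama": [0.88, 0.12, 0.04],
    "President Obama": [0.91, 0.09, 0.07],
    "New York City": [0.05, 0.95, 0.10],
    "NYC": [0.06, 0.93, 0.12],
}

result = canonicalize(TOY_EMBEDDINGS)
for representative, members in result.items():
    print(representative, "<-", members)
```

With these toy vectors the three Obama variants fall into one cluster and the two New York variants into another, so redundant triples that mention any variant could be merged under a single representative.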

Item Type: Conference Paper
Publisher: ASSOC COMPUTING MACHINERY
Additional Information: Copyright for this article belongs to ASSOC COMPUTING MACHINERY
Keywords: Canonicalization; Knowledge Graphs; Knowledge Graph Embeddings; Open Knowledge Bases
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 03 Jun 2019 11:45
Last Modified: 25 Aug 2022 08:35
URI: https://eprints.iisc.ac.in/id/eprint/62385
