Zero-shot word sense disambiguation using sense definition embeddings

Kumar, S and Jat, S and Saxena, K and Talukdar, P (2020) Zero-shot word sense disambiguation using sense definition embeddings. In: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, 28 July - 2 August 2019, Florence, pp. 5670-5681.

Official URL: https://doi.org/10.18653/v1/P19-1568

Abstract

Word Sense Disambiguation (WSD) is a longstanding but open problem in Natural Language Processing (NLP). WSD corpora are typically small in size, owing to an expensive annotation process. Current supervised WSD methods treat senses as discrete labels and also resort to predicting the Most-Frequent-Sense (MFS) for words unseen during training. This leads to poor performance on rare and unseen senses. To overcome this challenge, we propose Extended WSD Incorporating Sense Embeddings (EWISE), a supervised model to perform WSD by predicting over a continuous sense embedding space as opposed to a discrete label space. This allows EWISE to generalize over both seen and unseen senses, thus achieving generalized zero-shot learning. To obtain target sense embeddings, EWISE utilizes sense definitions. EWISE learns a novel sentence encoder for sense definitions by using WordNet relations and also ConvE, a recently proposed knowledge graph embedding method. We also compare EWISE against other sentence encoders pretrained on large corpora to generate definition embeddings. EWISE achieves new state-of-the-art WSD performance. © 2019 Association for Computational Linguistics
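
The idea of predicting over a continuous sense embedding space, rather than a fixed label set, can be illustrated with a minimal sketch. It is not the EWISE implementation: the toy vectors, the sense keys, the `predict_sense` helper, and the use of a plain dot product as the scoring function are all assumptions made for illustration. In the paper, the sense vectors would come from a definition encoder and the context vector from a trained context encoder.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_sense(
    context_vec: Sequence[float],
    candidate_senses: List[str],
    sense_embeddings: Dict[str, Sequence[float]],
) -> str:
    """Return the candidate sense whose embedding scores highest against the context.

    Because scoring needs only an embedding for each candidate, a sense that
    never appeared in the training data can still be ranked, as long as an
    embedding can be computed for it (for example, from its definition).
    """
    return max(candidate_senses, key=lambda s: dot(context_vec, sense_embeddings[s]))


# Hypothetical 3-dimensional embeddings (invented values, for illustration only).
sense_embeddings = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.0, 0.2, 0.9],
}

# Hypothetical context vector for a sentence such as "they sat on the bank of the stream".
context_vec = [0.1, 0.3, 0.8]

predicted = predict_sense(context_vec, ["bank%finance", "bank%river"], sense_embeddings)
print(predicted)
```

Adding a new sense in this setup only requires adding one more entry to `sense_embeddings`; no output layer has to be resized or retrained, which is the property that makes generalization to unseen senses possible.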

Item Type: Conference Paper
Publication: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Additional Information: The copyright for this article belongs to Association for Computational Linguistics (ACL).
Keywords: Computational linguistics; Embeddings; Signal encoding; Knowledge graphs; Label space; Large corpora; Natural language processing; Poor performance; State of the art; Word-sense disambiguation; WordNet; Natural language processing systems
Department/Centre: Division of Mechanical Sciences > Civil Engineering
Date Deposited: 08 Feb 2023 08:47
Last Modified: 08 Feb 2023 08:47
URI: https://eprints.iisc.ac.in/id/eprint/80063
