Kohli, A and Devi, VS (2024) Explainable Offensive Language Classifier. In: Communications in Computer and Information Science, 1962 C (1). pp. 299-313.
Abstract
Offensive content on social media has become a serious problem, which makes its automatic detection a crucial task. Deep learning approaches to Natural Language Processing (NLP) have been shown to match or even exceed human-level accuracy on offensive language detection tasks, which justifies deploying deep learning models for them. However, unlike humans, these models lack one key property: explainability. In this paper, we provide an explainable model for offensive language detection in a multi-task learning setting. Our model achieved an F1 score of 0.78 on the OLID dataset and 0.85 on the SOLID dataset. We also provide a detailed analysis of the model's interpretability. © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
| Item Type: | Journal Article |
|---|---|
| Publication: | Communications in Computer and Information Science |
| Publisher: | Springer Science and Business Media Deutschland GmbH |
| Additional Information: | The copyright for this article belongs to Author |
| Keywords: | Deep learning; Learning algorithms; Learning systems; Automatic Detection; Explainability and interpretability; Interpretability; Language detection; Language processing; Learning approach; Natural languages; Offensive languages; Social media; Natural language processing systems |
| Department/Centre: | Division of Biological Sciences > Biochemistry |
| Date Deposited: | 28 Feb 2024 13:28 |
| Last Modified: | 28 Feb 2024 13:28 |
| URI: | https://eprints.iisc.ac.in/id/eprint/83711 |