
Overview of the HASOC Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed language

Satapara, S and Modha, S and Mandl, T and Madhu, H and Majumder, P (2021) Overview of the HASOC Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed language. In: Working Notes of FIRE - 13th Forum for Information Retrieval Evaluation, FIRE-WN 2021, 13 - 17 December 2021, Gandhinagar, pp. 20-31.

Official URL: http://ceur-ws.org/Vol-3159/T1-2.pdf

Abstract

This paper presents an overview of a newly developed subtask offered at the Forum for Information Retrieval Evaluation (FIRE’21) on detecting contextual hate speech in social media conversations. Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) is offered as subtask 2 of the HASOC-English and Indo-Aryan Languages subtrack under the HASOC main track. The objective of the ICHCL subtask is to identify posts that appear normal on a standalone basis but might be judged hateful, profane, or offensive once their conversational context is considered. The subtask focuses on the binary classification of such contextual posts. The dataset is sampled from Twitter: around 7000 code-mixed posts in English and Hindi were downloaded and annotated with an annotation platform developed for this task. A total of 15 teams from across the world participated and submitted 50 runs for this track. The macro F1 score is the primary evaluation metric. The best-performing team reported a macro F1 score of around 0.74. The task shows that considering the context can improve the performance of classification methods. ICHCL can contribute to identifying the best methods for this task.
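
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how a macro F1 score can be computed for a binary task: the F1 score is calculated separately for each class and the two values are averaged with equal weight. It is not the organisers' evaluation script. The function names and the label strings "HOF" and "NONE" are assumptions chosen for illustration.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 label: Hashable) -> float:
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0.0 when the class never occurs
    # in either the gold labels or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Illustrative labels only (assumed): "HOF" for hateful, offensive, or
# profane posts and "NONE" for all other posts.
gold = ["HOF", "HOF", "NONE", "NONE", "NONE", "NONE"]
pred = ["HOF", "NONE", "NONE", "NONE", "NONE", "HOF"]
score = macro_f1(gold, pred)
```

Because both classes are weighted equally, a system that performs poorly on the minority class is penalised even when its overall accuracy is high.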

Item Type: Conference Paper
Publication: CEUR Workshop Proceedings
Publisher: CEUR-WS
Additional Information: The copyright for this article belongs to the CEUR-WS.
Keywords: Codes (symbols); Information retrieval; Social networking (online); Speech recognition; Binary classification; Classification methods; Context; F1 scores; Hate speech; Performance; Social media; Speech detection; Subtask; Fires
Department/Centre: Division of Electrical Sciences > Electronic Systems Engineering (Formerly Centre for Electronic Design & Technology)
Date Deposited: 04 Aug 2022 08:56
Last Modified: 04 Aug 2022 08:56
URI: https://eprints.iisc.ac.in/id/eprint/75284
