
LEAP submission for the third DIHARD diarization challenge

Singh, P and Varma, R and Krishnamohan, V and Chetupalli, SR and Ganapathy, S (2021) LEAP submission for the third DIHARD diarization challenge. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 2538-2542.

Official URL: https://doi.org/10.21437/Interspeech.2021-728


This paper describes the LEAP submission for the DIHARD-III challenge. The proposed system is composed of a speech bandwidth classifier and diarization systems fine-tuned separately for narrowband and wideband speech. We use an end-to-end speaker diarization system for the narrowband conversational telephone speech recordings. For the wideband multi-speaker recordings, we use a neural embedding based clustering approach, similar to the baseline system. The embeddings (called x-vectors) are extracted from a time-delay neural network and then clustered with the graph-based path integral clustering (PIC) approach. The LEAP system showed 24% and 18% relative improvements for Track-1 and Track-2, respectively, over the baseline system provided by the organizers. The paper also covers the post-evaluation analysis and the improvements observed on the DIHARD-III dataset. Copyright © 2021 ISCA.
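
The routing described in the abstract (classify the bandwidth, then hand the recording to the matching diarization system) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label strings "narrowband" and "wideband", the segment format, and the toy stand-ins for the classifier and the two diarizers are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# (start_sec, end_sec, speaker_label); an assumed output format for illustration.
Segment = Tuple[float, float, str]
Diarizer = Callable[[str], List[Segment]]


def diarize_by_bandwidth(
    recording: str,
    classify_bandwidth: Callable[[str], str],
    diarizers: Dict[str, Diarizer],
) -> List[Segment]:
    """Send a recording to the diarizer registered for its predicted bandwidth."""
    band = classify_bandwidth(recording)
    if band not in diarizers:
        raise ValueError(f"no diarizer registered for bandwidth {band!r}")
    return diarizers[band](recording)


# Hypothetical stand-ins. A real classifier would inspect the audio, and the
# real back ends would be an end-to-end model (narrowband) and an
# x-vector + PIC clustering pipeline (wideband).
def toy_classifier(recording: str) -> str:
    return "narrowband" if recording.endswith("_cts.wav") else "wideband"


def toy_end_to_end(recording: str) -> List[Segment]:
    return [(0.0, 1.0, "spk0")]


def toy_clustering(recording: str) -> List[Segment]:
    return [(0.0, 2.0, "spkA")]


SYSTEMS: Dict[str, Diarizer] = {
    "narrowband": toy_end_to_end,
    "wideband": toy_clustering,
}
```

Calling `diarize_by_bandwidth("call_cts.wav", toy_classifier, SYSTEMS)` dispatches to the narrowband stand-in. Keeping the classifier and the back ends as plain callables means either branch can be swapped out without touching the routing logic.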

Item Type: Conference Paper
Publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Additional Information: The copyright for this article belongs to International Speech Communication Association
Keywords: Audio recordings; Graph theory; Graphic methods; Neural networks; Speech communication; Speech recognition; Clustering approach; Clusterings; Diarization; Embeddings; End-to-end systems; Narrow bands; Path integral; Path integral clustering; Speaker diarization; X-vector
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 03 Dec 2021 08:51
Last Modified: 03 Dec 2021 08:51
URI: http://eprints.iisc.ac.in/id/eprint/70642
