
IITG-Indigo Submissions for NIST 2018 Speaker Recognition Evaluation and Post-Challenge Improvements

Singh, K and Kumar, N and Sinha, R and Ramoji, S and Ganapathy, S (2020) IITG-Indigo Submissions for NIST 2018 Speaker Recognition Evaluation and Post-Challenge Improvements. In: 26th National Conference on Communications, NCC 2020, 21-23 Feb. 2020, Kharagpur, India.

Official URL: https://dx.doi.org/10.1109/NCC48643.2020.9056055

Abstract

This paper describes the submissions of team Indigo from the Indian Institute of Technology Guwahati (IITG) to the NIST 2018 Speaker Recognition Evaluation (SRE18) challenge. The speaker verification (SV) systems were developed for the fixed training condition of SRE18. The SRE18 evaluation data is derived from two corpora: (i) Call My Net 2 (CMN2), and (ii) Video Annotation for Speech Technology (VAST). The VAST set is obtained by extracting audio from videos with strong musical or noisy backgrounds, so it helps assess the robustness of the SV systems. Several sub-systems are developed that differ in front-end modeling paradigm, back-end classifier, and suppression of repeating patterns in the data. The fusion of the sub-systems was submitted as the primary system, which achieved an actual detection cost function (actDCF) of 0.77 and an equal error rate (EER) of 13.79% on the SRE18 evaluation data. Post-challenge efforts include domain adaptation of the scores and voice activity detection using a deep neural network. With these enhancements, the best single sub-system achieves relative reductions of 38.4% and 11.6% in actDCF and EER, respectively, on the VAST trials. © 2020 IEEE.
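The abstract reports results as actDCF, EER, and relative reductions without defining them. As a rough illustration only, the Python sketch below shows one common way such quantities are computed from trial scores. It is not the authors' scoring code: the function names, the threshold sweep used to approximate the EER, the cost parameters, and the assumption that scores are calibrated log-likelihood ratios are all assumptions introduced here, and the official SRE18 scoring may differ in detail.

```python
import math


def error_rates(target_scores, nontarget_scores, threshold):
    """Return (P_miss, P_fa) when trials scoring >= threshold are accepted."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold
    and keeping the point where P_miss and P_fa are closest."""
    best = None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        p_miss, p_fa = error_rates(target_scores, nontarget_scores, t)
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            best = (gap, (p_miss + p_fa) / 2)
    return best[1]


def actual_dcf(target_scores, nontarget_scores, p_target, c_miss=1.0, c_fa=1.0):
    """Normalized detection cost at the Bayes threshold log(beta).
    Assumes scores are calibrated log-likelihood ratios; p_target, c_miss
    and c_fa are illustrative parameters, not values taken from the source."""
    beta = (c_fa / c_miss) * (1.0 - p_target) / p_target
    threshold = math.log(beta)
    p_miss, p_fa = error_rates(target_scores, nontarget_scores, threshold)
    return p_miss + beta * p_fa


def relative_reduction(baseline, improved):
    """Relative reduction in percent of a metric with respect to a baseline."""
    return 100.0 * (baseline - improved) / baseline


# Toy scores, invented purely to exercise the functions.
toy_targets = [2.0, 1.5, 0.5, -0.5]
toy_nontargets = [-2.0, -1.0, 0.0, 1.0]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
toy_dcf = actual_dcf(toy_targets, toy_nontargets, p_target=0.5)
```

Under these definitions, a relative reduction such as those quoted for the VAST trials compares the metric of the enhanced sub-system with that of the same sub-system before the post-challenge changes.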

Item Type: Conference Paper
Publication: 26th National Conference on Communications, NCC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: Conference of 26th National Conference on Communications, NCC 2020; Conference Date: 21 February 2020 through 23 February 2020; Conference Code: 159017
Keywords: Audio acoustics; Cost functions; Deep neural networks; Indian institute of technologies; Modeling paradigms; Relative reduction; Repeating patterns; Speaker recognition evaluations; Speaker verification; Training conditions; Voice activity detection; Speech recognition
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 07 Sep 2020 10:48
Last Modified: 07 Sep 2020 10:48
URI: http://eprints.iisc.ac.in/id/eprint/65304
