
Attention based hybrid I-vector BLSTM model for language recognition

Padi, B and Mohan, A and Ganapathy, S (2019) Attention based hybrid I-vector BLSTM model for language recognition. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 1263-1267.

Official URL: https://doi.org/10.21437/Interspeech.2019-2371

Abstract

In this paper, a hybrid i-vector neural network framework (i-BLSTM), which models the sequence information present in a series of short-segment i-vectors, is proposed for the task of spoken language recognition (LRE). A sequence of short-segment i-vectors is extracted for every speech utterance and is then modeled using a bidirectional long short-term memory (BLSTM) recurrent neural network (RNN). An attention mechanism inside the neural network weights the segments of the speech utterance by relevance, and the model learns to give higher weights to the parts of the speech data that are more helpful to the classification task. The proposed framework performs better than the conventional i-vector system in short-duration and noisy conditions. Experiments are performed on clean, noisy and multi-speaker speech data from the NIST LRE 2017 and RATS language recognition corpora. In these experiments, the proposed approach yields significant improvements (relative improvements of 7.6-13% in terms of accuracy for noisy conditions) over the conventional i-vector based language recognition approach and also over an end-to-end LSTM-RNN based approach.
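
The attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the attention, so a generic dot-product scoring followed by a softmax is assumed here, the BLSTM that would produce the per-segment representations is omitted, and the names attention_pool, segment_vectors and query are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segment_vectors, query):
    """Weight each segment vector by relevance and return the weighted sum.

    segment_vectors: list of equal-length float lists, one per short segment
                     (stand-ins for per-segment BLSTM outputs; assumption).
    query:           learned scoring vector of the same length (assumption;
                     fixed by hand here instead of being trained).
    """
    # Relevance score of each segment: dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in segment_vectors]
    # Normalize scores so the weights are positive and sum to one.
    weights = softmax(scores)
    # Utterance-level representation: attention-weighted sum of the segments.
    dim = len(segment_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, segment_vectors))
        for d in range(dim)
    ]
    return weights, pooled


# Toy input: three 2-dimensional segment vectors; the second scores highest.
segments = [[0.1, 0.0], [2.0, 1.0], [0.2, -0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(segments, query)
```

In a full system the pooled vector would feed a language classifier, and the query would be trained jointly with the rest of the network so that more informative segments receive larger weights.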

Item Type: Conference Paper
Publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Additional Information: The copyright for this article belongs to International Speech Communication Association.
Keywords: Attention; LSTM; Short segment i-vectors; Spoken language recognition
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 05 Dec 2022 10:03
Last Modified: 05 Dec 2022 10:03
URI: https://eprints.iisc.ac.in/id/eprint/78256
