Dimitriadis, Dimitrios and Thomas, Samuel and Ganapathy, Sriram (2017) An Investigation on the Use of i-vectors for Robust ASR. In: 17th Annual Conference of the International-Speech-Communication-Association (INTERSPEECH 2016), SEP 08-12, 2016, San Francisco, CA, pp. 3828-3832.
Abstract
In this paper, we propose two different i-vector representations that improve the noise robustness of automatic speech recognition (ASR). The first kind of i-vector is derived from the "noise only" components of speech provided by an adaptive denoising algorithm; the second is extracted from mel filterbank energies containing both speech and noise. The effectiveness of both representations is shown by combining them with two different kinds of spectral features: the commonly used log-mel filterbank energies and Teager energy spectral coefficients (TESCs). Using two different DNN architectures for acoustic modeling, a standard state-of-the-art sigmoid-based DNN and an advanced architecture using leaky ReLUs, dropout and rescaling, we demonstrate the benefit of the proposed representations. On the Aurora-4 multi-condition training task, the proposed front-end improves ASR performance by 4%.
Item Type: | Conference Paper |
---|---|
Series: | Interspeech |
Additional Information: | Copyright for this article belongs to the ISCA-INT SPEECH COMMUNICATION ASSOC, C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 30 Oct 2017 03:39 |
Last Modified: | 30 Oct 2017 03:39 |
URI: | http://eprints.iisc.ac.in/id/eprint/58129 |