Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout

Kovacs, Gyorgy and Toth, Laszlo and Van Compernolle, Dirk and Ganapathy, Sriram (2017) Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout. In: Pattern Recognition Letters, 100. pp. 44-50.

Official URL: http://dx.doi.org/10.1016/j.patrec.2017.09.023

Abstract

Developing automatic speech recognition systems that are robust to mismatched and noisy channel conditions is a challenging problem, especially when the training and the test conditions are different. Here, we seek to increase the robustness of convolutional neural network (CNN) acoustic models under such circumstances by combining two methods. Firstly, we propose an improved version of input dropout, which exploits the special structure of the input time-frequency representation. Instead of just dropping out random 'pixels' of the spectrogram, the proposed channel dropout approach discards whole spectral channels. We expect that this dropout strategy will force the network to rely less on the whole spectrum, and make it more robust to channel mismatches and narrow-band noise. Secondly, we replaced the standard mel-spectrogram input representation with the autoregressive moving average (ARMA) spectrogram, which was recently shown to outperform the former under mismatched train-test conditions. In our experiments on the Aurora-4 database, the proposed channel dropout method attained relative word error rate reductions of 16% with ARMA features (an absolute improvement of 3%), and 20% with FBANK features (an absolute improvement of 7%) over the baseline CNN, when using the clean training scenario. © 2017 Elsevier B.V. All rights reserved.
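
The difference between dropping random 'pixels' and dropping whole spectral channels can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(channels, frames)` array layout, the use of zero as the masking value, the absence of any rescaling of the kept values, and the dropout rate are all assumptions made for illustration.

```python
import numpy as np


def pixel_dropout(spec, rate, rng):
    """Ordinary input dropout: zero individual time-frequency cells.

    spec is assumed to have shape (channels, frames).
    """
    mask = rng.random(spec.shape) >= rate  # one decision per cell
    return spec * mask


def channel_dropout(spec, rate, rng):
    """Channel dropout sketch: zero entire spectral channels.

    A single keep/drop decision is drawn per channel (row) and applied
    to every frame, so a dropped channel is missing for the whole input.
    Masking with zero and not rescaling are assumptions, not details
    taken from the paper.
    """
    keep = rng.random(spec.shape[0]) >= rate  # one decision per channel
    return spec * keep[:, None]  # broadcast the decision across frames


rng = np.random.default_rng(0)
spec = rng.normal(size=(40, 100))  # hypothetical 40 channels x 100 frames
masked = channel_dropout(spec, rate=0.2, rng=rng)
dropped_rows = np.flatnonzero(~masked.any(axis=1))
print("channels dropped:", dropped_rows.size)
```

With pixel dropout, every channel still contributes some cells to each training example; with the channel variant, the network sees examples in which some frequency bands are missing entirely, which matches the motivation given in the abstract of relying less on the whole spectrum.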

Item Type: Journal Article
Publication: PATTERN RECOGNITION LETTERS
Publisher: Elsevier Science BV
Additional Information: Copyright for this article belongs to Elsevier Science BV, PO Box 211, 1000 AE Amsterdam, Netherlands
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 13 Jan 2018 06:46
Last Modified: 13 Jan 2018 06:46
URI: http://eprints.iisc.ac.in/id/eprint/58591
