Adiga, Aniruddha and Magimai-Doss, Mathew and Seelamantula, Chandra Sekhar (2013) Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition. In: IEEE International Conference of Region 10 (TENCON), October 22-25, 2013, Xi'an, People's Republic of China.
Abstract
We develop noise-robust features using Gammatone wavelets derived from the popular Gammatone functions. These wavelets incorporate the characteristics of the human peripheral auditory system, in particular the spatially varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). The procedure for extracting GWCC from a speech signal is similar to that of the conventional Mel-Frequency Cepstral Coefficients (MFCC) technique; the difference lies in the type of filterbank used. We replace the conventional mel filterbank in MFCC with a Gammatone wavelet filterbank, which we construct using Gammatone wavelets. We also explore the effect of Gammatone filterbank-based features (Gammatone Cepstral Coefficients (GCC)) for robust speech recognition. On the AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that Gammatone-based features yield better recognition performance at low SNRs.
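The pipeline described in the abstract (an MFCC-style front end with the mel filterbank swapped for a different filterbank) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not define the Gammatone wavelet, so the filterbank below is an assumption: a standard frequency-domain magnitude approximation of ordinary gammatone filters with centre frequencies spaced on the ERB-rate scale, which makes it closer in spirit to GCC than to GWCC. All function names, the frame sizes, the number of filters, and the number of cepstral coefficients are hypothetical choices for illustration.

```python
import numpy as np


def erb_rate(f):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore)."""
    return 21.4 * np.log10(1.0 + 0.00437 * f)


def inverse_erb_rate(e):
    """Map an ERB-rate value back to frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def gammatone_filterbank(n_filters, n_fft, fs, f_lo=100.0, order=4):
    """Approximate squared-magnitude responses of gammatone filters.

    Assumption: uses |H(f)|^2 ~ [1 + ((f - fc) / b)^2]^(-order), not the
    Gammatone wavelet filterbank of the paper.
    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # Centre frequencies equally spaced on the ERB-rate scale.
    fc = inverse_erb_rate(
        np.linspace(erb_rate(f_lo), erb_rate(0.9 * fs / 2.0), n_filters)
    )
    erb = 24.7 * (1.0 + 0.00437 * fc)  # equivalent rectangular bandwidth in Hz
    b = 1.019 * erb                    # gammatone bandwidth parameter
    return (1.0 + ((freqs[None, :] - fc[:, None]) / b[:, None]) ** 2) ** (-order)


def cepstral_coefficients(signal, filterbank, frame_len=200, hop=80, n_ceps=13):
    """MFCC-style extraction: frame, power spectrum, filterbank, log, DCT-II."""
    n_fft = 2 * (filterbank.shape[1] - 1)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Power spectrum of each windowed frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    # Log filterbank energies; a small floor avoids log(0).
    log_energy = np.log(power @ filterbank.T + 1e-10)
    # Unnormalised DCT-II basis, one row per cepstral index.
    n_filters = filterbank.shape[0]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return log_energy @ dct_basis.T


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate for this example
    rng = np.random.default_rng(0)
    speech_like = rng.standard_normal(fs)  # one second of noise as a stand-in
    fb = gammatone_filterbank(n_filters=23, n_fft=256, fs=fs)
    features = cepstral_coefficients(speech_like, fb)
    print(features.shape)
```

Passing the filterbank in as a matrix keeps the rest of the front end unchanged, so a mel, gammatone, or Gammatone wavelet filterbank could be compared simply by supplying a different matrix of the same shape.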
Item Type: | Conference Proceedings |
---|---|
Series: | TENCON IEEE Region 10 Conference Proceedings |
Publisher: | IEEE |
Additional Information: | Copyright for this article belongs to IEEE, 345 E 47th St, New York, NY 10017, USA |
Keywords: | Gammatone wavelets; Auditory modeling; Cepstral coefficients; Speech recognition |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 09 Jun 2014 09:54 |
Last Modified: | 09 Jun 2014 09:54 |
URI: | http://eprints.iisc.ac.in/id/eprint/49235 |