Yarra, Chiranjeevi and Nagesh, Supriya and Deshmukh, Om D and Ghosh, Prasanta Kumar (2019) Noise robust speech rate estimation using signal-to-noise ratio dependent sub-band selection and peak detection strategy. In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 146 (3). pp. 1615-1628.
Abstract
Speech (syllable) rate estimation typically involves computing a feature contour based on sub-band energies that has strong local maxima (peaks) at syllable nuclei, which are detected with the help of voicing decisions (VDs). While such a two-stage scheme works well in clean conditions, the estimated speech rate becomes less accurate in noisy conditions, particularly at low signal-to-noise ratios (SNRs), mainly because of erroneous VDs and non-informative sub-bands. This work proposes a technique that uses VDs in the peak detection strategy in an SNR-dependent manner. It also proposes a data-driven sub-band pruning technique to improve the syllabic peaks of the feature contour in the presence of noise. Further, this paper generalizes both the peak detection and the sub-band pruning techniques to unknown noise and/or unknown SNR conditions. Experiments are performed separately in clean and 20, 10, and 0 dB SNR conditions using the Switchboard, TIMIT, and CTIMIT corpora under five additive noises: white, car, high-frequency-channel, cockpit, and babble. Experiments are also carried out in test conditions at unseen SNRs of -5 and 5 dB with four unseen additive noises: factory, subway, street, and exhibition. The proposed method outperforms the best of the existing techniques in clean and noisy conditions on all three corpora.
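The generic two-stage scheme described in the abstract (a feature contour whose peaks are accepted only where the voicing decision is positive) can be illustrated with a minimal sketch. This is not the paper's method: the function name `estimate_syllable_rate`, the fixed `threshold`, and the `min_gap` rule are assumptions made for illustration, and the sketch does not include the SNR-dependent use of VDs or the sub-band pruning that the paper proposes.

```python
def estimate_syllable_rate(contour, voiced, frame_rate_hz, threshold=0.0, min_gap=1):
    """Count voiced local maxima of a feature contour (hypothetical helper).

    contour       -- per-frame feature values (e.g. derived from sub-band energies)
    voiced        -- per-frame voicing decisions (True = voiced)
    frame_rate_hz -- number of contour frames per second
    threshold     -- assumed minimum peak height
    min_gap       -- assumed minimum spacing between accepted peaks, in frames
    Returns (syllables per second, list of accepted peak indices).
    """
    if len(contour) != len(voiced):
        raise ValueError("contour and voiced must have the same length")
    if frame_rate_hz <= 0:
        raise ValueError("frame_rate_hz must be positive")

    peaks = []
    for i in range(1, len(contour) - 1):
        is_local_max = contour[i - 1] < contour[i] >= contour[i + 1]
        # Stage two: keep a peak only if it is voiced and tall enough.
        if not (is_local_max and voiced[i] and contour[i] > threshold):
            continue
        if peaks and i - peaks[-1] < min_gap:
            # Two candidates too close together: keep the taller one.
            if contour[i] > contour[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)

    duration_s = len(contour) / frame_rate_hz
    rate = len(peaks) / duration_s if duration_s > 0 else 0.0
    return rate, peaks


if __name__ == "__main__":
    # Toy contour: three local maxima, the last one falls in an unvoiced frame.
    contour = [0.0, 0.2, 1.0, 0.3, 0.1, 0.9, 0.2, 0.1, 0.8, 0.1]
    voiced = [False, True, True, True, False, True, True, False, False, False]
    print(estimate_syllable_rate(contour, voiced, frame_rate_hz=10, threshold=0.5, min_gap=2))
```

In this toy input the third maximum is rejected because its frame is marked unvoiced, which shows how a wrong voicing decision in noise would directly change the peak count and therefore the estimated rate.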
Item Type: | Journal Article |
---|---|
Publication: | JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA |
Publisher: | ACOUSTICAL SOC AMER AMER INST PHYSICS |
Additional Information: | Copyright of this article belongs to ACOUSTICAL SOC AMER AMER INST PHYSICS |
Keywords: | 2ND-LANGUAGE LEARNERS FLUENCY; SYLLABLE NUCLEI; QUANTITATIVE ASSESSMENT; LOW-COMPLEXITY; SPEAKING RATE; RECOGNITION; TRACKING; DATABASE; READ |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 04 Dec 2019 10:43 |
Last Modified: | 04 Dec 2019 10:43 |
URI: | http://eprints.iisc.ac.in/id/eprint/63863 |