A Study on Robustness of Articulatory Features for Automatic Speech Recognition of Neutral and Whispered Speech

Srinivasan, G and Illa, A and Ghosh, PK (2019) A Study on Robustness of Articulatory Features for Automatic Speech Recognition of Neutral and Whispered Speech. In: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, 12 - 17 May 2019, Brighton, pp. 5936-5940.

Official URL: https://doi.org/10.1109/ICASSP.2019.8683103

Abstract

Traditionally, automatic speech recognition (ASR) systems are trained on acoustic representations of neutral speech. As a result, their performance degrades when tested with whispered speech. In this work, we explore the robustness of articulatory features in ASR of neutral and whispered speech. We use acoustic, articulatory, and integrated acoustic and articulatory feature vectors in matched and mismatched train-test cases. The results suggest that articulatory data is useful in ASR of both neutral and whispered speech, especially in the mismatched train-test cases. When we concatenate acoustic and articulatory feature vectors and apply them to the mismatched train-test case, where the model is trained with neutral speech and tested with whispered speech, a relative improvement in phone error rate of 27.2% is observed compared to when only acoustic features are used. This suggests that articulatory data contains information complementary to acoustic representations. A phone-specific recognition error analysis is also presented, illustrating the phones for which adding articulatory information gives the maximum benefit.
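
The two operations the abstract relies on, frame-level concatenation of acoustic and articulatory feature vectors and the relative improvement in phone error rate (PER), can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the feature dimensions, and all numeric values are placeholders assumed for the example and are not taken from the paper.

```python
def concat_features(acoustic, articulatory):
    """Concatenate per-frame acoustic and articulatory vectors.

    Both inputs are lists of frames (one list of floats per frame).
    Assumes the two streams are already time-aligned frame by frame.
    """
    if len(acoustic) != len(articulatory):
        raise ValueError("streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(acoustic, articulatory)]


def relative_improvement(baseline_per, new_per):
    """Relative reduction in phone error rate, in percent."""
    if baseline_per <= 0:
        raise ValueError("baseline PER must be positive")
    return 100.0 * (baseline_per - new_per) / baseline_per


# Placeholder data (hypothetical): 3 frames, 2 acoustic dims, 1 articulatory dim.
acoustic = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
articulatory = [[1.0], [2.0], [3.0]]
fused = concat_features(acoustic, articulatory)
print(len(fused), len(fused[0]))  # 3 3

# Placeholder PER values (hypothetical, not results from the paper).
print(relative_improvement(50.0, 40.0))  # 20.0
```

Under this definition, a relative improvement is measured against the acoustic-only baseline, so the same absolute drop in PER counts for more when the baseline error is lower.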

Item Type:	Conference Paper
Publication:	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Additional Information:	The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc.
Keywords:	Audio signal processing; Speech; Speech communication; Telephone sets; Acoustic features; Articulatory data; Articulatory features; Articulatory information; Automatic speech recognition; Automatic speech recognition system; Specific recognition; Whispered speech; Speech recognition
Department/Centre:	Division of Electrical Sciences > Electrical Engineering
Date Deposited:	15 Dec 2022 08:07
Last Modified:	15 Dec 2022 08:07
URI:	https://eprints.iisc.ac.in/id/eprint/78374