
Whispered speech to neutral speech conversion using bidirectional LSTMs

Meenakshi, G Nisha and Ghosh, Prasanta Kumar (2018) Whispered speech to neutral speech conversion using bidirectional LSTMs. In: 19th Annual Conference of the International Speech Communication Association, 2-6 September, 2018, Hyderabad International Convention Centre (HICC), Hyderabad, pp. 491-495.

Official URL: https://dx.doi.org/10.21437/Interspeech.2018-1487


We propose a bidirectional long short-term memory (BLSTM) based whispered speech to neutral speech conversion system that employs the STRAIGHT speech synthesizer. We use a BLSTM to map the spectral features of whispered speech to those of neutral speech. Three other BLSTMs are employed to predict the pitch, periodicity levels and voiced/unvoiced phoneme decisions from the spectral features of whispered speech. We use objective measures to quantify the quality of the predicted spectral features and excitation parameters, using data recorded from six subjects in a four-fold setup. We find that the temporal smoothness of the spectral features predicted by the proposed BLSTM based system is statistically significantly higher than that of the features predicted by deep neural network based baseline schemes. We also observe that, while the performance of the proposed system is comparable to the baseline scheme for pitch prediction, it is superior in classifying voicing decisions and predicting periodicity levels. In a subjective listening test, the proposed method is chosen as the best performing scheme 26.61% (absolute) more often than the best baseline scheme. This indicates that the proposed method yields more natural-sounding neutral speech from whispered speech.
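The four-network layout described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the random untrained weights, the single-layer architecture, the linear output layers, and all function and variable names (`blstm_forward`, `convert`, `N_SPEC`, and so on) are assumptions made for illustration, and the STRAIGHT synthesis step is omitted. It only shows how one whispered spectral sequence could feed four separate BLSTMs, one per predicted quantity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(rng, n_in, n_hidden):
    # One stacked weight matrix for the input, forget, cell and output gates.
    return {
        "W": rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in + n_hidden)),
        "b": np.zeros(4 * n_hidden),
    }

def lstm_pass(params, frames):
    # Run a unidirectional LSTM over frames of shape (T, n_in).
    n_hidden = params["b"].size // 4
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    outputs = []
    for x in frames:
        z = params["W"] @ np.concatenate([x, h]) + params["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)  # shape (T, n_hidden)

def init_blstm(rng, n_in, n_hidden, n_out):
    return {
        "fwd": init_lstm(rng, n_in, n_hidden),
        "bwd": init_lstm(rng, n_in, n_hidden),
        "V": rng.normal(0.0, 0.1, size=(n_out, 2 * n_hidden)),
        "c": np.zeros(n_out),
    }

def blstm_forward(params, frames):
    # Forward pass reads frames left to right; backward pass reads them
    # right to left and is flipped back so both align frame by frame.
    h_fwd = lstm_pass(params["fwd"], frames)
    h_bwd = lstm_pass(params["bwd"], frames[::-1])[::-1]
    h = np.concatenate([h_fwd, h_bwd], axis=1)
    return h @ params["V"].T + params["c"]  # linear output layer (assumed)

def convert(models, whisper_spec):
    # All four networks take the same whispered spectral features as input.
    return {
        "spectrum": blstm_forward(models["spectrum"], whisper_spec),
        "pitch": blstm_forward(models["pitch"], whisper_spec)[:, 0],
        "periodicity": blstm_forward(models["periodicity"], whisper_spec),
        # Voiced if the (assumed) sigmoid output exceeds 0.5.
        "voiced": sigmoid(blstm_forward(models["voicing"], whisper_spec)[:, 0]) > 0.5,
    }

# Hypothetical sizes, chosen only to keep the example small.
N_SPEC, N_HIDDEN, N_BANDS, N_FRAMES = 8, 16, 3, 20

rng = np.random.default_rng(0)
models = {
    "spectrum": init_blstm(rng, N_SPEC, N_HIDDEN, N_SPEC),
    "pitch": init_blstm(rng, N_SPEC, N_HIDDEN, 1),
    "periodicity": init_blstm(rng, N_SPEC, N_HIDDEN, N_BANDS),
    "voicing": init_blstm(rng, N_SPEC, N_HIDDEN, 1),
}
whisper_spec = rng.normal(size=(N_FRAMES, N_SPEC))  # stand-in for real features
out = convert(models, whisper_spec)
print({name: value.shape for name, value in out.items()})
```

Because the backward pass sees later frames, each output frame depends on the whole utterance rather than only on past context. This is the property that distinguishes a BLSTM from a frame-wise feed-forward network.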

Item Type: Conference Proceedings
Series: Interspeech
Additional Information: 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Hyderabad, India, 2-6 September 2018
Keywords: Whispered speech; LSTM; STRAIGHT
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 16 Jul 2019 08:05
Last Modified: 16 Jul 2019 08:05
URI: http://eprints.iisc.ac.in/id/eprint/62913
