
Noise robust pitch stylization using minimum mean absolute error criterion

Yarra, C and Ghosh, PK (2021) Noise robust pitch stylization using minimum mean absolute error criterion. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 3121-3125.

INTERSPEECH_2021.pdf - Published Version
Restricted to Registered users only

Official URL: https://doi.org/10.21437/Interspeech.2021-1307

Abstract

We propose a pitch stylization technique that operates in the presence of pitch halving and doubling errors. The technique uses a minimum mean absolute error (MAE) optimization criterion to make the stylization robust to such pitch estimation errors, particularly under noisy conditions. Segments for the stylization are obtained automatically using dynamic programming. Experiments are performed at the frame level and the syllable level. At the frame level, the closeness of the stylized pitch to the ground-truth pitch, which is obtained from a laryngograph signal, is measured using root mean square error (RMSE). At the syllable level, the effectiveness of the perceptually relevant embeddings in the stylized pitch is analyzed by estimating syllabic tones and comparing them with manual tone markings using the Levenshtein distance. The proposed approach performs better than a minimum mean squared error (MSE) criterion based pitch stylization scheme at the frame level and a knowledge-based tone estimation scheme at the syllable level, under clean conditions and at 20 dB, 10 dB and 0 dB SNR, with five noise types and four pitch estimation techniques. Across all combinations of SNR, noise and pitch estimation technique, the highest absolute improvements in RMSE and mean distance are 6.49 Hz and 0.23, respectively. Copyright © 2021 ISCA.
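
As a rough illustration of why an absolute-error criterion tolerates halving and doubling errors better than a squared-error one, the following sketch stylizes a toy pitch contour with dynamic programming based segmentation. It is not the authors' implementation: the abstract does not give the segment model or the cost formulation, so the piecewise-constant segments, the fixed segment count, and the names `segment_cost`, `stylize` and `rmse` are all assumptions made for this example.

```python
from statistics import mean, median


def segment_cost(values, criterion):
    """Best constant level for one segment and its cost (assumed segment model)."""
    if criterion == "mae":
        level = median(values)  # the median minimizes the sum of absolute errors
        return level, sum(abs(v - level) for v in values)
    level = mean(values)  # the mean minimizes the sum of squared errors
    return level, sum((v - level) ** 2 for v in values)


def stylize(pitch, num_segments, criterion="mae"):
    """Piecewise-constant stylization; boundaries chosen by dynamic programming."""
    n = len(pitch)
    inf = float("inf")
    # best[k][j]: minimum total cost of covering pitch[:j] with k segments
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # last segment covers pitch[i:j]
                cost = best[k - 1][i] + segment_cost(pitch[i:j], criterion)[1]
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i
    # Trace the boundaries back and fill each segment with its level.
    stylized = [0.0] * n
    j = n
    for k in range(num_segments, 0, -1):
        i = back[k][j]
        level = segment_cost(pitch[i:j], criterion)[0]
        stylized[i:j] = [level] * (j - i)
        j = i
    return stylized


def rmse(a, b):
    """Frame-level root mean square error between two contours."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# Toy data (invented for illustration): two level stretches of pitch in Hz,
# with two doubling errors and one halving error in the estimated contour.
truth = [100.0] * 10 + [150.0] * 10
estimated = list(truth)
estimated[3] = estimated[4] = 200.0  # doubling errors
estimated[14] = 75.0                 # halving error

rmse_mae = rmse(stylize(estimated, 2, "mae"), truth)
rmse_mse = rmse(stylize(estimated, 2, "mse"), truth)
print(rmse_mae, rmse_mse)  # the MAE value is 0.0 for this toy contour
```

On this toy contour the median-based levels ignore the three erroneous frames, so the MAE stylization matches the reference exactly, while the mean-based levels are pulled toward the errors. The figures reported in the abstract come from the authors' experiments and are not reproduced by this sketch.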

Item Type: Conference Paper
Publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Additional Information: The copyright for this article belongs to International Speech Communication Association
Keywords: Acoustic variables measurement; Continuous speech recognition; Errors; Knowledge based systems; Mean square error; Signal to noise ratio; Speech communication; Dynamic programming based segmentation; Estimation techniques; MAE criterion; Mean absolute error; Minimum MAE criteria; Noise robust; Noise robustness; Pitch estimation; Pitch stylizations; Root mean square errors; Dynamic programming
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 03 Dec 2021 08:50
Last Modified: 03 Dec 2021 08:50
URI: http://eprints.iisc.ac.in/id/eprint/70637
