Formant-gaps Features for Speaker Verification Using Whispered Speech

Naini, AR and Rao Mv, A and Ghosh, PK (2019) Formant-gaps Features for Speaker Verification Using Whispered Speech. In: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, 12 May 2019-17 May 2019, Brighton, pp. 6231-6235.

Official URL: https://doi.org/10.1109/ICASSP.2019.8682571

Abstract

In this work, we propose a new formant-based feature for the whispered speaker verification (SV) task, in which neutral data is used for enrollment and whispered recordings are used for testing. Such a mismatch between enrollment and test often degrades the performance of whispered SV systems because the acoustic characteristics of whispered and neutral speech differ. We hypothesize that the proposed formant and formant gap (FoG) features are more invariant to the mode of speech in capturing speaker-specific information than traditional baseline features for SV, including mel frequency cepstral coefficients (MFCC) and auditory-inspired amplitude modulation features (AAMF). Whispered SV experiments with 714 speakers, comprising 29232 neutral and 22932 whispered recordings, reveal that the equal error rate (EER) using the proposed features is lower than that using the best baseline features by ~3.79 (absolute). It was also observed that at least four whispered recordings during enrollment are required for the baseline features to perform on par with the proposed features. However, it was found that the best performing baseline features yield an EER for the neutral SV task which is ~1.88 higher than that using the proposed features.
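
For readers unfamiliar with the metric, the equal error rate reported in the abstract is the operating point at which the false-acceptance rate and the false-rejection rate of a verification system coincide. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function name `equal_error_rate` and the example scores are hypothetical, and it assumes that a higher score means the trial is more likely to be a genuine (same-speaker) trial.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: higher scores indicate a more likely same-speaker trial.
    """
    # Candidate thresholds: every distinct score observed in either set.
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # A trial is accepted when its score is >= t.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of trials the two error rates may never be exactly equal, so this sketch takes the threshold where they are closest and averages them; other interpolation schemes are also in common use.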

Item Type: Conference Paper
Publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc.
Keywords: Audio recordings; Speech; Speech communication; Speech recognition; Acoustic characteristic; Equal error rate; Feature-based; Formants; Mel-frequency cepstral coefficients; Speaker specific informations; Speaker verification; Whispered speech; Audio signal processing
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 30 Nov 2022 08:49
Last Modified: 30 Nov 2022 08:49
URI: https://eprints.iisc.ac.in/id/eprint/78384
