A Spectro-temporal Technique for Estimating Aperiodicity and Voiced/unvoiced Decision Boundaries of Speech Signals

Dhiman, JK and Seelamantula, CS (2019) A Spectro-temporal Technique for Estimating Aperiodicity and Voiced/unvoiced Decision Boundaries of Speech Signals. In: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, 12 May 2019 through 17 May 2019, Brighton, pp. 6510-6514.

Official URL: https://doi.org/10.1109/ICASSP.2019.8682385

Abstract

In contrast to a 1-D short-time analysis of speech, 2-D approaches aim at characterizing the speech signal attributes jointly in time and frequency. In this paper, we focus on the quasi-periodicity of a voiced spectro-temporal patch and quantify it by proposing an aperiodicity measure defined using the underlying frequency modulations in the patch. We further propose a time-frequency aperiodicity map obtained by overlapping and adding the aperiodicity measures across patches. The proposed aperiodicity map is utilized to obtain band-wise aperiodicity parameters, which are essential for high-quality speech synthesis. The aperiodicity in unvoiced patches is addressed by identifying them using the coherence of the patch. In addition, the proposed technique provides the voiced/unvoiced decision boundaries of a speech signal. The effectiveness of the proposed band-wise aperiodicity parameters and voiced/unvoiced decisions is verified by incorporating them in an existing state-of-the-art vocoder for speech synthesis. Subjective listening tests show that the quality of the reconstructed speech is on par with that of the state-of-the-art WORLD vocoder in terms of mean opinion score, indicating that spectro-temporal approaches are highly promising for speech analysis and synthesis applications.

Item Type: Conference Paper
Publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc.
Keywords: Quality control; Speech communication; Speech synthesis; Vocoders; Analysis and synthesis; Aperiodicity; Band-wise aperiodicity parameters; Coherence maps; High quality speech synthesis; Spectrograms; Subjective listening test; Time and frequencies; Audio signal processing
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 15 Dec 2022 08:17
Last Modified: 15 Dec 2022 08:17
URI: https://eprints.iisc.ac.in/id/eprint/78388
