
PSFM-A Probabilistic Source Filter Model for Noise Robust Glottal Closure Instant Detection

Rao, Achuth M and Ghosh, Prasanta Kumar (2018) PSFM-A Probabilistic Source Filter Model for Noise Robust Glottal Closure Instant Detection. In: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 26 (9). pp. 1645-1657.

Official URL: https://dx.doi.org/10.1109/TASLP.2018.2834733

Abstract

Accurate estimation of glottal closure instants (GCIs) enables several pitch-synchronous speech analysis tasks, such as prosody modification, glottal inverse filtering, and the study of pathological speech. We propose a probabilistic source-filter model (PSFM) for voiced speech in which the source follows a Bernoulli-Gaussian distribution that models the GCI locations, and the all-pole filter coefficients follow a Gaussian distribution. The probability of a GCI at each speech sample is estimated using Gibbs sampling. We propose a cost function to estimate the exact GCI locations using N-best dynamic programming. A key feature of the proposed PSFM is that it allows the second-order statistics of the noise to be included when estimating the GCI locations, resulting in a noise-robust GCI detection technique, although one with high computational complexity. Evaluation on the archivable priority list actual-word database (APLAWD) shows that the proposed algorithm performs on par with the state-of-the-art GCI detection method on clean speech. However, when evaluated in noisy conditions using five types of noise at six different signal-to-noise ratio (SNR) levels, the proposed method performs better than the best of the existing GCI detection schemes, particularly at low SNR, indicating its noise robustness.
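As a rough illustration of the generative side of such a model, the sketch below draws a Bernoulli-Gaussian excitation, passes it through an all-pole filter, and adds noise at a chosen SNR. It is only a forward simulation under stated assumptions. It does not reproduce the paper's Gibbs sampling inference or its N-best dynamic programming step. The function names, the fixed filter coefficients, the spike probability `p`, and the use of white Gaussian noise are all hypothetical choices made for this example. The abstract models the filter coefficients as Gaussian, whereas here they are fixed so that the filter is guaranteed to be stable.

```python
import math
import random


def bernoulli_gaussian_source(n, p, sigma, rng):
    """Draw n excitation samples: each sample is active with probability p,
    and active samples get a zero-mean Gaussian amplitude (std sigma)."""
    active = [rng.random() < p for _ in range(n)]
    source = [rng.gauss(0.0, sigma) if is_on else 0.0 for is_on in active]
    return active, source


def all_pole_filter(source, a):
    """All-pole synthesis: s[t] = e[t] - sum_k a[k] * s[t - k - 1]."""
    out = []
    for t, e in enumerate(source):
        acc = e
        for k, coef in enumerate(a):
            if t - k - 1 >= 0:
                acc -= coef * out[t - k - 1]
        out.append(acc)
    return out


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled to give the requested SNR in dB.
    White noise is an assumption of this sketch."""
    power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(power / (10.0 ** (snr_db / 10.0)))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


rng = random.Random(0)
# Sparse spikes stand in for glottal closure instants (hypothetical p).
active, source = bernoulli_gaussian_source(n=400, p=0.02, sigma=1.0, rng=rng)
# Hypothetical stable 2nd-order coefficients (pole magnitude about 0.77).
a = [-1.3, 0.6]
speech = all_pole_filter(source, a)
noisy = add_noise_at_snr(speech, snr_db=0.0, rng=rng)
```

In this toy setting, the `active` flags play the role of the GCI indicators that an inference procedure would try to recover from `noisy` alone.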

Item Type: Journal Article
Publication: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA
Additional Information: Copyright of this article belongs to IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 18 Jul 2018 15:18
Last Modified: 25 Aug 2022 11:59
URI: https://eprints.iisc.ac.in/id/eprint/60221
