ePrints@IIScePrints@IISc Home | About | Browse | Latest Additions | Advanced Search | Contact | Help

Relative occurrences and difference of extrema for detection of transitions between broad phonetic classes

Ananthapadmanabha, T and Girish, K V Vijay and Ramakrishnan, A G (2018) Relative occurrences and difference of extrema for detection of transitions between broad phonetic classes. In: SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 43 (10).

[img] PDF
Sad_43_10_2018.pdf - Published Version
Restricted to Registered users only

Download (1MB) | Request a copy
Official URL: https://dx.doi.org/10.1007/s12046-018-0923-x

Abstract

Detection of transitions between broad phonetic classes in a speech signal has applications such as landmark detection and segmentation. The proposed hierarchical method detects silence to non-silence transitions, sonorant to non-sonorant transitions and vice-versa. The subset of the extrema (minimum or maximum amplitude samples) above a threshold, occurring between every pair of successive zero-crossings, is selected from each frame of the bandpass-filtered speech signal. Locations of the first and the last extrema lie on either side far away from the mid-point (reference) of a frame, if the speech signal belongs to a non-transition segment; else, one of these locations lies within a few samples from the reference, indicating a transition frame. The transitions are detected from the entire TIMIT database for clean speech and 93.6% of them are within a tolerance of 20 ms from the phone boundaries. Sonorant, unvoiced non-sonorant and silence classes and their respective onsets are detected with an accuracy of about 83.5% for the same tolerance with respect to the labelled TIMIT database as reference. The results are as good as, and in some aspects better than, the state-of-the-art methods for similar tasks. The proposed method is also tested on the test set of the TIMIT database for robustness with respect to white, babble and Schroeder noise, and about 90% of the transitions are detected within a tolerance of 20 ms at the signal to noise ratio of 5 dB. On NTIMIT database, 62.7% of the transitions are detected, and 63.5% of the sonorant onsets, within 20 ms tolerance.

Item Type: Journal Article
Additional Information: Copyright of this article belong to SPRINGER INDIA, 7TH FLOOR, VIJAYA BUILDING, 17, BARAKHAMBA ROAD, NEW DELHI, 110 001, INDIA
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Depositing User: Id for Latest eprints
Date Deposited: 20 Aug 2018 14:47
Last Modified: 06 Oct 2018 13:54
URI: http://eprints.iisc.ac.in/id/eprint/60461

Actions (login required)

View Item View Item