Yarra, C and Ramanathi, MK and Ghosh, PK (2019) Comparison of automatic syllable stress detection quality with time-aligned boundaries and context dependencies. In: 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19, 20 - 21 September 2019, Graz, pp. 79-83.
Abstract
Syllable stress is detected automatically using a classifier trained on stress labels and on features computed from the acoustics within each syllable. In real scenarios, syllable data is typically estimated using an acoustic model (AM) and a lexicon, so their quality affects stress detection performance (accuracy). In this work, we analyse variations in accuracy on the ISLE corpus, which contains spoken English utterances from non-native speakers. The analysis considers five AMs and five lexicons containing native English pronunciations augmented with different percentages of non-native pronunciations collected from the corpus. For each AM and lexicon combination, we estimate syllable data using two existing forced-alignment techniques and observe that the accuracies obtained with features from the two sets of estimated data are comparable. Further, we propose a set of features based on context dependencies of the syllable nuclei. For all combinations, accuracy is higher when the context-based features are combined with the acoustic features, and the highest accuracy is obtained for the combination whose estimated syllable data has the least error. Among the five lexicons, the highest accuracy for ITA is obtained when the lexicon includes all of the non-native pronunciations and the lowest when it includes none; for GER, the highest accuracy is obtained when the lexicon includes none of them and the lowest when it includes all.
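As a rough illustration of the feature augmentation described in the abstract, the sketch below concatenates per-syllable acoustic features with context-based features and trains a simple classifier on the result. Everything in it is an assumption made for illustration: the feature values are toy numbers, the two acoustic dimensions and the two context dimensions are hypothetical placeholders, and the nearest-centroid classifier stands in for whatever classifier the paper actually uses, which the abstract does not specify.

```python
from typing import Dict, List, Sequence


def augment(acoustic: Sequence[float], context: Sequence[float]) -> List[float]:
    """Concatenate acoustic and context-based features for one syllable."""
    return list(acoustic) + list(context)


class NearestCentroid:
    """Toy stand-in classifier: assigns the label of the closest class mean."""

    def fit(self, X: List[List[float]], y: List[int]) -> "NearestCentroid":
        sums: Dict[int, List[float]] = {}
        counts: Dict[int, int] = {}
        for row, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(row)
                counts[label] = 0
            sums[label] = [s + v for s, v in zip(sums[label], row)]
            counts[label] += 1
        # Mean feature vector per class.
        self.centroids = {
            label: [s / counts[label] for s in vec] for label, vec in sums.items()
        }
        return self

    def predict(self, X: List[List[float]]) -> List[int]:
        def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda label: sq_dist(row, self.centroids[label]))
            for row in X
        ]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of syllables whose predicted stress label matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): one row per syllable, label 1 = stressed, 0 = unstressed.
ACOUSTIC = [[0.30, 0.80], [0.28, 0.75], [0.12, 0.40], [0.10, 0.35]]
CONTEXT = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
LABELS = [1, 1, 0, 0]

X = [augment(a, c) for a, c in zip(ACOUSTIC, CONTEXT)]
model = NearestCentroid().fit(X, LABELS)
train_accuracy = accuracy(LABELS, model.predict(X))
```

In an actual experiment the syllable boundaries, and therefore the acoustic features, would come from forced alignment with a given AM and lexicon, and accuracy would be measured on held-out syllables rather than on the training rows used here.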
Item Type: | Conference Poster |
---|---|
Publication: | 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19 |
Publisher: | The International Speech Communication Association (ISCA) |
Additional Information: | The copyright for this article belongs to The International Speech Communication Association (ISCA). |
Keywords: | Feature extraction; Speech recognition; Acoustics model; Context dependency; Context dependent; Context dependent feature; Detection performance; Estimated syllable data; Non-native; Non-native speakers; Stress detection; Stress detection quality; Stresses |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 01 Dec 2022 06:59 |
Last Modified: | 01 Dec 2022 06:59 |
URI: | https://eprints.iisc.ac.in/id/eprint/78124 |