
Noise robust goodness of pronunciation measures using teacher's utterance

Sudhakara, S and Ramanathi, MK and Yarra, C and Das, A and Ghosh, PK (2019) Noise robust goodness of pronunciation measures using teacher's utterance. In: 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19, 20 - 21 September 2019, Graz, pp. 69-73.

Official URL: https://doi.org/10.21437/SLaTE.2019-13

Abstract

In computer-aided pronunciation training (CAPT), evaluating a second-language learner's pronunciation is an important task. Goodness of pronunciation (GoP) has been shown to be effective for this task, but it is typically computed under clean speech conditions. In real scenarios, however, CAPT systems often have to deal with noisy conditions, which could degrade the effectiveness of GoP. We analyze the variation in GoP performance under noisy conditions by adding three types of noise, namely babble, white, and F-16, at signal-to-noise ratios (SNRs) of 20 dB, 10 dB, and 0 dB. We hypothesize that using phonemes uttered by a teacher would make the GoP score more robust and bring it closer to human ratings, and on this basis we propose a modification to the typical lexicon-based GoP (LGoP). The proposed scheme is referred to as teacher-utterance-based GoP (TGoP). In addition, the GoPs of the learner's and the teacher's utterances are combined into a GoP-like (GL) score based on the difference between the two. The correlation coefficient between the GoP scores and the teacher's ratings is used as the performance metric. Experiments on speech data collected from Indian English learners reveal that, although the performance of the different GoP schemes drops with additive noise, TGoP performs better than LGoP in both clean and noisy conditions. In low-SNR conditions, GL performs better than both TGoP and LGoP. © SLaTE 2019. All rights reserved.
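The evaluation setup described in the abstract (additive noise at a fixed SNR, then correlation against ratings) can be illustrated with a minimal sketch. This is not the authors' code: the function names `mix_at_snr`, `measured_snr_db`, and `pearson` are hypothetical, the sine wave and Gaussian noise stand in for real speech and noise recordings, and the use of the Pearson coefficient is an assumption, since the abstract does not say which correlation coefficient was used.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it."""
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


def pearson(x, y):
    """Pearson correlation coefficient (assumed metric, see lead-in)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic stand-ins: a 440 Hz tone for "speech", Gaussian samples for white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
white = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# The three SNR conditions named in the abstract.
for snr_db in (20, 10, 0):
    noisy = mix_at_snr(clean, white, snr_db)
    print(snr_db, "dB target, measured:", measured_snr_db(clean, noisy))

# Placeholder numbers only, NOT data from the paper: one score and one rating per utterance.
placeholder_scores = [-1.2, -0.4, -2.5, -0.9]
placeholder_ratings = [3.0, 4.0, 1.0, 3.5]
print("correlation:", pearson(placeholder_scores, placeholder_ratings))
```

In a real experiment the noisy signals would be passed through the GoP scoring pipeline, and the resulting scores would be correlated with the teacher's ratings separately for each noise type and SNR.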

Item Type: Conference Paper
Publication: 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19
Publisher: International Speech Communication Association (ISCA)
Additional Information: The copyright for this article belongs to the International Speech Communication Association (ISCA).
Keywords: Computer aided analysis; Signal to noise ratio; Computer-aided pronunciation training; Condition; Goodness of pronunciation; Goodness of pronunciation like score; Lexicon based goodness of pronunciation; Lexicon-based; Noise analysis; Noise analysis for goodness of pronunciation; Teacher utterance based goodness of pronunciation; Teachers'; Additive noise
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 29 Nov 2022 09:38
Last Modified: 29 Nov 2022 09:38
URI: https://eprints.iisc.ac.in/id/eprint/78112
