Singh, A and Shah, C and Varadaraj, R and Chauhan, S and Ghosh, PK (2023) SPIRE-SIES: A Spontaneous Indian English Speech Corpus. In: 26th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques, O-COCOSDA 2023, 4 December 2023 through 6 December 2023, Delhi.
Abstract
In this paper, we present a 170.83-hour Indian English spontaneous speech dataset. The lack of Indian English speech data is one of the major hindrances to developing robust speech systems adapted to the Indian speech style, and this scarcity is even more pronounced for spontaneous speech. The corpus is crowd-sourced across varied Indian nativities, genders and age groups. Traditional spontaneous speech collection strategies capture speech during interviews or conversations. In this study, we use images as stimuli to induce spontaneity in speech. Transcripts for 23 hours are generated and validated, and can serve as a spontaneous speech ASR benchmark. The quality of the corpus is validated with voice activity detection based segmentation, gender verification and image semantic correlation, which determines a relationship between the image stimulus and the recorded speech using caption keywords derived from an image-to-text model and high-occurring words derived from transcripts generated by Whisper ASR. © 2023 IEEE.
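The image semantic correlation check described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring method, so everything below is an assumption made for illustration: the function names, the small stopword list, the top-k cutoff, and the overlap ratio are hypothetical, and the caption and transcript strings stand in for the outputs of an image-to-text model and an ASR system.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {
    "a", "an", "the", "is", "are", "and", "of", "in", "on", "to",
    "there", "with", "it", "this", "i", "can", "see",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_words(transcript, k=10):
    """Return the k most frequent non-stopword tokens in a transcript."""
    return {word for word, _ in Counter(tokenize(transcript)).most_common(k)}


def keyword_overlap(caption, transcript, k=10):
    """Fraction of caption keywords found among the transcript's top-k words.

    This ratio is a hypothetical stand-in for a relatedness score between
    an image stimulus and the speech recorded in response to it.
    """
    caption_keywords = set(tokenize(caption))
    if not caption_keywords:
        return 0.0
    return len(caption_keywords & top_words(transcript, k)) / len(caption_keywords)


# Placeholder strings standing in for model outputs.
caption = "a dog running on the beach"
transcript = "I can see a dog on the beach. The dog is playing near the water on the beach."
score = keyword_overlap(caption, transcript)
```

A higher score suggests that the speaker described the image they were shown, while a score near zero would flag a recording for manual review. Any threshold for acceptance would need to be tuned on real data.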
Item Type: | Conference Paper |
---|---|
Publication: | Proceedings of 2023 26th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardization of Speech Databases and Assessment Techniques, O-COCOSDA 2023 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc. |
Keywords: | Image segmentation; Speech recognition, Age groups; Collection strategies; Image stimulus; Indian accented english; Robust speech; Speech corpora; Speech data; Speech style; Speech systems; Spontaneous speech, Semantics |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 04 Sep 2024 06:06 |
Last Modified: | 04 Sep 2024 06:06 |
URI: | http://eprints.iisc.ac.in/id/eprint/84983 |