
Talker diarization in the wild: The case of child-centered daylong audio-recordings

Cristia, Alejandrina and Ganesh, Shobhana and Casillas, Marisa and Ganapathy, Sriram (2018) Talker diarization in the wild: The case of child-centered daylong audio-recordings. In: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018, 2-6 September 2018, Hyderabad International Convention Centre (HICC), Hyderabad, pp. 2583-2587.

Official URL: https://dx.doi.org/10.21437/Interspeech.2018-2078


Speaker diarization (answering 'who spoke when') is a widely researched subject within speech technology. Numerous experiments have been run on datasets built from broadcast news, meeting data, and call centers, and the task sometimes appears close to being solved. Much less work has tackled the hardest diarization task of all: spontaneous conversations in real-world settings. Such diarization would be particularly useful for studies of language acquisition, where researchers investigate the speech children produce and hear in their daily lives. In this paper, we study audio gathered with a recorder worn by small children as they went about their normal days. As a result, each child was exposed to different acoustic environments with a multitude of background noises and a varying number of adults and peers. The inconsistency of speech and noise within and across samples poses a challenging task for speaker diarization systems, which we tackled via retraining and data augmentation techniques. We further studied sources of structured variation across raw audio files, including the impact of speaker type distribution, proportion of speech from children, and child age on diarization performance. We discuss the extent to which these findings might generalize to other samples of speech in the wild.

Item Type: Conference Proceedings
Series: Interspeech
Additional Information: 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Hyderabad, India, 2-6 September 2018
Keywords: speaker diarization; language acquisition; spontaneous speech; i-vectors
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 09 Jun 2020 05:55
Last Modified: 09 Jun 2020 05:55
URI: http://eprints.iisc.ac.in/id/eprint/62925
