Sharma, N and Ganesh, S and Ganapathy, S and Holt, LL (2019) Analyzing Human Reaction Time for Talker Change Detection. In: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, 12 - 17 May 2019, Brighton, pp. 7135-7139.
Abstract
The ability to detect a change in the input is an essential aspect of perception. In speech communication, we use this ability to identify "talker changes" when listening to conversational speech (such as audio podcasts). In this paper, we propose a novel experimental paradigm to improve our understanding of how quickly listeners detect a change in talker and of the acoustic features they track to identify a voice. A listening experiment is designed in which listeners indicate the moment of perceived talker change in multi-talker speech utterances. We examine talker change detection (TCD) performance by probing the human reaction time (RT). A random forest regression is used to model the relationship between RTs and acoustic features. The findings suggest that: (i) RT is less than a second, (ii) RT can be predicted from the difference in acoustic features of the segments before and after the change, and (iii) there exists a significant dependence of RT on MFCC-D1 (delta MFCC) features between the segments of speech before and after the change instant. Further, a machine system designed for the same TCD task using speaker diarization principles performed poorly relative to human listeners. © 2019 IEEE.
Item Type: | Conference Paper |
---|---|
Publication: | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc. |
Keywords: | Audio signal processing; Decision trees; Speech; Speech analysis; Speech communication; Speech recognition; Acoustic features; Change detection; Conversational speech; Machine systems; Poor performance; Random forests; Speaker diarization; Speech utterance; Human reaction time |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering Others |
Date Deposited: | 30 Nov 2022 09:33 |
Last Modified: | 30 Nov 2022 09:33 |
URI: | https://eprints.iisc.ac.in/id/eprint/78382 |