

Ryant, Neville and Bergelson, Elika and Church, Kenneth and Cristia, Alejandrina and Du, Jun and Ganapathy, Sriram and Khudanpur, Sanjeev and Kowalski, Diana and Krishnamoorthy, Mahesh and Kulshreshta, Rajat and Liberman, Mark and Lu, Yu-Ding and Maciejewski, Matthew and Metze, Florian and Profant, Jan and Sun, Lei and Tsao, Yu and Yu, Zhou (2018) Enhancement and Analysis of Conversational Speech: JSALT 2017. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 15-20, 2018, Calgary, Canada, pp. 5154-5158.

Official URL: https://doi.org/10.1109/ICASSP.2018.8462468


Automatic speech recognition is more and more widely and effectively used. Nevertheless, in some automatic speech analysis tasks the state of the art is surprisingly poor. One of these is "diarization", the task of determining who spoke when. Diarization is key to processing meeting audio and clinical interviews, extended recordings such as police body cam or child language acquisition data, and any other speech data involving multiple speakers whose voices are not cleanly separated into individual channels. Overlapping speech, environmental noise and suboptimal recording techniques make the problem harder. During the JSALT Summer Workshop at CMU in 2017, an international team of researchers worked on several aspects of this problem, including calibration of the state of the art, detection of overlaps, enhancement of noisy recordings, and classification of shorter speech segments. This paper sketches the workshop's results, and announces plans for a "Diarization Challenge" to encourage further progress.

Item Type: Conference Proceedings
Publisher: IEEE
Additional Information: Copyright for this article belongs to IEEE
Keywords: diarization; overlap detection; speech enhancement; automatic speech recognition
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 26 Oct 2018 14:44
Last Modified: 26 Oct 2018 14:44
URI: http://eprints.iisc.ac.in/id/eprint/60963
