
Context dependent RNNLM for automatic transcription of conversations

Chetupalli, SR and Ganapathy, S (2020) Context dependent RNNLM for automatic transcription of conversations. In: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai, China, pp. 886-890.

Official URL: https://dx.doi.org/10.21437/Interspeech.2020-1813


Conversational speech, while unstructured at the utterance level, typically has a macro topic that provides a larger context spanning multiple utterances. Current language models in speech recognition systems that use recurrent neural networks (RNNLM) rely mainly on the local context and exclude this larger context. To model the long-term dependencies of words across multiple sentences, we propose a novel architecture in which the words from prior utterances are converted to an embedding. The relevance of these embeddings for predicting the next word in the current sentence is determined by a gating network. The relevance-weighted context embedding vector is combined in the language model to improve next-word prediction, and the entire model, including the context embedding and the relevance weighting layers, is jointly learned for a conversational language modeling task. Experiments are performed on two conversational datasets: the AMI corpus and the Switchboard corpus. On these tasks, we show that the proposed approach yields significant improvements in language model perplexity over the RNNLM baseline. In addition, using the proposed conversational LM for ASR rescoring results in an absolute WER reduction of 1.2 on the Switchboard dataset and 1.0 on the AMI dataset over the RNNLM-based ASR baseline. Copyright © 2020 ISCA
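The relevance-gating idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the mean pooling of word vectors, the single sigmoid gate over the concatenated state and context, the concatenation before the output layer, and all function and parameter names (`utterance_embedding`, `relevance_gate`, `gate_weights`, and so on) are assumptions made for illustration, and training of the parameters is not shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def utterance_embedding(word_vectors):
    """Embed one prior utterance by averaging its word vectors (assumed pooling)."""
    dim = len(word_vectors[0])
    count = len(word_vectors)
    return [sum(w[i] for w in word_vectors) / count for i in range(dim)]


def relevance_gate(hidden, context, gate_weights, gate_bias):
    """Scalar relevance in (0, 1) of one context embedding for the current LM state."""
    # Hypothetical gate: a sigmoid over a linear map of [hidden; context].
    return sigmoid(dot(gate_weights, hidden + context) + gate_bias)


def weighted_context(hidden, contexts, gate_weights, gate_bias):
    """Sum the context embeddings, each scaled by its relevance gate."""
    dim = len(contexts[0])
    gates = [relevance_gate(hidden, c, gate_weights, gate_bias) for c in contexts]
    pooled = [sum(g * c[i] for g, c in zip(gates, contexts)) for i in range(dim)]
    return pooled, gates


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(hidden, context_vec, output_weights):
    """Predict the next word from the LM state concatenated with the context vector."""
    features = hidden + context_vec
    return softmax([dot(row, features) for row in output_weights])
```

In this sketch, `hidden` stands for the recurrent state of the language model at the current word, and `contexts` holds one embedding per prior utterance. In a jointly learned model, the gate and output parameters would be trained together with the language model rather than set by hand.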

Item Type: Conference Paper
Publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech Communication Association
Additional Information: Cited by 0; Conference: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020; Conference Date: 25 October 2020 through 29 October 2020; Conference Code: 165507
Keywords: Computational linguistics; Electric switchboards; Embeddings; Modeling languages; Speech communication; Speech recognition; Automatic transcription; Context dependent; Conversational speech; Long-term dependencies; Novel architecture; Speech recognition systems; Switchboard corpus; Word prediction; Recurrent neural networks
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 12 Jan 2021 06:27
Last Modified: 12 Jan 2021 06:27
URI: http://eprints.iisc.ac.in/id/eprint/67645
