Kumar, PJ and Yarra, C and Ghosh, PK (2021) DNN based phrase boundary detection using knowledge-based features and feature representations from CNN. In: 27th National Conference on Communications, 27-30 Jul 2021, Kanpur.
Abstract
Automatic phrase boundary detection could be useful in applications including computer-assisted pronunciation tutoring, spoken language understanding, and automatic speech recognition. In this work, we consider the problem of phrase boundary detection on English utterances spoken by native speakers of American English. Most existing works on boundary detection use either knowledge-based features or representations learnt from a convolutional neural network (CNN) based architecture operating on word segments. However, we hypothesize that combining knowledge-based features and learned representations could improve performance on the boundary detection task. For this, we propose a fusion-based model built from a deep neural network (DNN) and CNNs, where the CNNs are used for learning representations and the DNN is used to combine the knowledge-based features with the learned representations. Further, unlike existing data-driven methods, we use two CNNs for representation learning, one for word segments and another for word-final syllable segments. Experiments on the Boston University radio news and Switchboard corpora show the benefit of the proposed fusion-based approach compared to a baseline using knowledge-based features only and another baseline using feature representations from a CNN only. © 2021 IEEE.
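The fusion idea in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract does not give layer sizes, input features, or training details, so every dimension, every name (`conv1d_maxpool`, `boundary_probability`, `params`), the single convolution layer per branch, the global max pooling, and the random untrained weights are assumptions chosen only to show how two CNN branches and a vector of knowledge-based features might be concatenated and passed to a DNN.

```python
import math
import random


def conv1d_maxpool(frames, kernels):
    """Valid 1-D convolution over time with ReLU, then global max pooling.

    frames:  list of T frames, each a list of D floats
    kernels: list of K kernels, each a list of W rows of D weights
    returns: list of K pooled activations, one per kernel
    """
    pooled = []
    for kernel in kernels:
        width = len(kernel)
        best = 0.0  # ReLU floor: negative activations are clipped to 0
        for start in range(len(frames) - width + 1):
            act = sum(
                w * x
                for row, frame in zip(kernel, frames[start:start + width])
                for w, x in zip(row, frame)
            )
            best = max(best, act)
        pooled.append(best)
    return pooled


def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def boundary_probability(word_frames, syllable_frames, knowledge_feats, params):
    """Fuse two CNN representations with knowledge-based features in a DNN."""
    word_repr = conv1d_maxpool(word_frames, params["word_kernels"])
    syl_repr = conv1d_maxpool(syllable_frames, params["syl_kernels"])
    fused = knowledge_feats + word_repr + syl_repr  # list concatenation
    hidden = dense(fused, params["w_hidden"], params["b_hidden"], relu)
    return dense(hidden, params["w_out"], params["b_out"], sigmoid)[0]


# Hypothetical sizes: frame dim, kernel width, kernels per CNN,
# number of knowledge-based features, hidden units.
D, W, K, N_KB, H = 4, 3, 5, 6, 8
rng = random.Random(0)
params = {
    "word_kernels": [random_matrix(rng, W, D) for _ in range(K)],
    "syl_kernels": [random_matrix(rng, W, D) for _ in range(K)],
    "w_hidden": random_matrix(rng, H, N_KB + 2 * K),
    "b_hidden": [0.0] * H,
    "w_out": random_matrix(rng, 1, H),
    "b_out": [0.0],
}

# Stand-in inputs: random frames for one word and its final syllable.
word_frames = random_matrix(rng, 20, D)
syllable_frames = random_matrix(rng, 8, D)
knowledge_feats = [rng.uniform(0.0, 1.0) for _ in range(N_KB)]

p = boundary_probability(word_frames, syllable_frames, knowledge_feats, params)
print(f"untrained boundary score: {p:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows the data flow, in which each word contributes one word-level representation, one word-final-syllable representation, and one knowledge-based feature vector to a single binary decision.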
Item Type: | Conference Paper |
---|---|
Publication: | 2021 National Conference on Communications, NCC 2021 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc. |
Keywords: | Computer aided instruction; Convolutional neural networks; Face recognition; Feature extraction; Human computer interaction; Knowledge based systems; Speech recognition, Boundary detection; Computer assisted; Computer-assisted pronunciation tutoring; Convolutional neural network; Convolutional neural network based representation learning; Feature representation; Knowledge based; Network-based; Phrase boundary detections; Spoken language understanding, Deep neural networks |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 07 Dec 2021 10:22 |
Last Modified: | 07 Dec 2021 10:22 |
URI: | http://eprints.iisc.ac.in/id/eprint/70379 |