Jain, Abhilash and Rathna, GN (2017) Visual Speech Recognition for Isolated Digits Using Discrete Cosine Transform and Local Binary Pattern Features. In: 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP 2017), Nov 14-16, 2017, Montreal, QC, Canada, pp. 368-372.
Abstract
Visual Speech Recognition (VSR) deals with the task of extracting speech information from visual cues from a person's face while speaking. Accurate lip segmentation and modeling are essential in any VSR algorithm for good feature extraction. However, lip modeling is a complicated task and is not very robust in natural conditions. This paper describes a novel technique for limited vocabulary visual-only speech recognition that does not use lip modeling. For visual feature extraction, Discrete Cosine Transform (DCT) and Local Binary Pattern (LBP) have been tested. An Error-Correcting Output Codes (ECOC) multi-class model using Support Vector Machine (SVM) binary learners is used for recognition and classification of words.
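
The pipeline named in the abstract (DCT or LBP features fed to an ECOC model built from SVM binary learners) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic 16x16 "mouth region" frames, the 4x4 low-frequency DCT block, the basic 3x3 LBP histogram, the linear kernel, and the `code_size` value are all assumptions chosen for illustration, and the helper names `dct_features` and `lbp_histogram` are hypothetical. It uses scikit-learn's `OutputCodeClassifier`, which builds a random code book rather than any specific coding design the paper may have used.

```python
import numpy as np
from scipy.fft import dctn
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC


def dct_features(frame, k=4):
    """Keep the k x k low-frequency block of the 2-D DCT of a grayscale frame."""
    coeffs = dctn(frame.astype(float), type=2, norm="ortho")
    return coeffs[:k, :k].ravel()


def lbp_histogram(frame):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (border pixels skipped)."""
    f = frame.astype(float)
    h, w = f.shape
    center = f[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    # Eight neighbors, visited clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = f[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


# Synthetic stand-in data (assumption): one random template per digit plus noise.
rng = np.random.default_rng(0)
n_classes, per_class, size = 10, 12, 16
templates = rng.uniform(0.0, 1.0, size=(n_classes, size, size))
frames, labels = [], []
for label in range(n_classes):
    for _ in range(per_class):
        frames.append(templates[label] + rng.normal(0.0, 0.02, size=(size, size)))
        labels.append(label)
labels = np.array(labels)

# Either descriptor can be used as the feature vector; DCT is used for training here.
X_dct = np.array([dct_features(f) for f in frames])
X_lbp = np.array([lbp_histogram(f) for f in frames])

# ECOC multi-class model with SVM binary learners (random code book).
clf = OutputCodeClassifier(SVC(kernel="linear"), code_size=2.0, random_state=0)
clf.fit(X_dct[::2], labels[::2])            # even-indexed samples for training
predictions = clf.predict(X_dct[1::2])      # odd-indexed samples held out
accuracy = clf.score(X_dct[1::2], labels[1::2])
```

Swapping `X_dct` for `X_lbp` in the `fit` and `predict` calls trains the same classifier on LBP histograms instead. A real system would extract features from a sequence of video frames per utterance rather than from a single image, a detail this sketch leaves out.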
Item Type: | Conference Paper |
---|---|
Series: | IEEE Global Conference on Signal and Information Processing |
Publisher: | IEEE |
Additional Information: | 5th IEEE Global Conference on Signal and Information Processing (GlobalSIP), Montreal, CANADA, NOV 14-16, 2017 |
Keywords: | visual speech recognition; local binary patterns; discrete cosine transform; feature extraction |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 15 Jan 2019 14:59 |
Last Modified: | 15 Jan 2019 14:59 |
URI: | http://eprints.iisc.ac.in/id/eprint/61270 |