Jain, Abhilash and Rathna, GN (2018) Visual speech recognition for isolated digits using discrete cosine transform and local binary pattern features. In: 5th IEEE Global Conference on Signal and Information Processing, GlobalSIP 2017, 14-16 November 2017, Montreal, pp. 368-372.
Abstract
Visual Speech Recognition (VSR) deals with the task of extracting speech information from visual cues from a person's face while speaking. Accurate lip segmentation and modeling are essential in any VSR algorithm for good feature extraction. However, lip modeling is a complicated task and is not very robust in natural conditions. This paper describes a novel technique for limited vocabulary visual-only speech recognition that does not use lip modeling. For visual feature extraction, Discrete Cosine Transform (DCT) and Local Binary Pattern (LBP) have been tested. An Error-Correcting Output Codes (ECOC) multi-class model using Support Vector Machine (SVM) binary learners is used for recognition and classification of words.
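The pipeline outlined in the abstract (DCT or LBP features followed by ECOC classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the mouth-region size, the number of retained coefficients, the LBP variant, or the code matrix, so the 3x3 LBP neighbourhood, the parameter `k`, the codebook, and all function names below are assumptions chosen for illustration. The SVM binary learners are not implemented here; `ecoc_decode` only shows how the bits such learners would output could be mapped to a word.

```python
import math

def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square list-of-lists block."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def dct_features(block, k=3):
    """Keep the top-left k x k (low-frequency) coefficients as a flat list.
    The value of k is an assumption, not taken from the paper."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]

def lbp_histogram(img):
    """Basic 3x3 LBP: compare the 8 neighbours with the centre pixel,
    pack the comparisons into an 8-bit code, and return a normalised
    256-bin histogram over the interior pixels."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist]

def ecoc_decode(bits, codebook):
    """Return the label whose codeword has the smallest Hamming distance
    to the bits predicted by the binary learners."""
    def hamming(label):
        return sum(b != c for b, c in zip(bits, codebook[label]))
    return min(codebook, key=hamming)

# Hypothetical 5-bit codebook for three digit words.
CODEBOOK = {
    "zero": [0, 0, 0, 0, 0],
    "one":  [1, 1, 1, 0, 0],
    "two":  [0, 0, 1, 1, 1],
}
```

For example, `ecoc_decode([1, 0, 1, 0, 0], CODEBOOK)` selects `"one"` even though one binary learner is wrong, which is the error-correcting property that gives ECOC its name. In practice a library implementation of the DCT, LBP, and ECOC-with-SVM steps would replace these naive loops.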
| Field | Value |
|---|---|
| Item Type | Conference Paper |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Additional Information | The copyright for this article belongs to the Institute of Electrical and Electronics Engineers Inc. |
| Keywords | Discrete cosine transform; Feature extraction; Local binary patterns; Visual speech recognition |
| Department/Centre | Division of Electrical Sciences > Electrical Engineering |
| Date Deposited | 07 Jun 2022 10:27 |
| Last Modified | 07 Jun 2022 10:27 |
| URI | https://eprints.iisc.ac.in/id/eprint/73206 |