
Lip reading using simple dynamic features and a novel ROI for feature extraction

Jain, A and Rathna, GN (2018) Lip reading using simple dynamic features and a novel ROI for feature extraction. In: 2018 International Conference on Signal Processing and Machine Learning, SPML 2018, 28-30 November 2018, Shanghai, China, pp. 73-77.

Official URL: https://doi.org/10.1145/3297067.3297083


Deaf and hard-of-hearing people rely largely on lip reading to understand speech, which demonstrates that humans can understand speech from visual cues alone. Automatic lip reading systems work in a similar fashion, obtaining speech or text from visual information only, such as a video of a person's face. In this paper, an automatic lip reading system for spoken digit recognition is presented. The system uses simple dynamic features obtained by creating difference images between consecutive frames of the video input. Using this technique, word recognition rates of 83.79% and 65.58% are achieved in speaker-dependent and speaker-independent testing scenarios, respectively. A novel, extended region of interest (ROI), which includes the lower jaw and neck region, is also introduced; most lip reading algorithms use only the mouth/lip region for feature extraction. Compared with using the mouth alone as the ROI, the proposed ROI improves performance by 4% in speaker-dependent tests and by 11% in speaker-independent tests. © 2018 Association for Computing Machinery.
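
The abstract names two ideas that a short sketch may make more concrete: difference images between consecutive frames, and an ROI extended below the mouth. The Python below is an illustration only, not the authors' implementation. The function names `difference_images` and `extend_roi`, the use of absolute (rather than signed) differences, the `(top, bottom, left, right)` box layout, and the `down_factor` parameter are all assumptions made for this example; the record does not state how the paper computes the differences or sizes the extended ROI.

```python
import numpy as np


def difference_images(frames):
    """Absolute differences between consecutive grayscale frames.

    frames: array-like of shape (T, H, W). Returns shape (T - 1, H, W).
    Taking the absolute value is an assumption of this sketch.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames of shape (H, W)")
    return np.abs(frames[1:] - frames[:-1])


def extend_roi(mouth_box, frame_shape, down_factor=1.0):
    """Stretch a mouth box downward to cover more of the jaw/neck area.

    mouth_box: hypothetical (top, bottom, left, right) pixel bounds.
    down_factor: hypothetical multiple of the box height to add below it.
    The result is clipped to the frame height.
    """
    top, bottom, left, right = mouth_box
    extra = int(round(down_factor * (bottom - top)))
    new_bottom = min(frame_shape[0], bottom + extra)
    return top, new_bottom, left, right


# Usage on synthetic data: 5 random 64x64 frames and a made-up mouth box.
video = np.random.default_rng(0).integers(0, 256, size=(5, 64, 64))
top, bottom, left, right = extend_roi((30, 40, 20, 44), video.shape[1:])
diffs = difference_images(video[:, top:bottom, left:right])
print(diffs.shape)  # (4, 20, 24)
```

In a real pipeline the mouth box would come from a face or landmark detector, and the difference images would feed a feature extractor and classifier; neither step is shown here because the record gives no details about them.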

Item Type: Conference Proceedings
Publication: ACM International Conference Proceeding Series
Publisher: Association for Computing Machinery
Additional Information: Copyright for this article belongs to Association for Computing Machinery.
Keywords: Audition; Extraction; Feature extraction; Image segmentation; Machine learning; Difference images; Digit recognition; Lip reading; Region of interest; Speaker dependents; Speaker independents; Visual information; Visual speech recognition; Speech recognition
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 29 Apr 2019 07:49
Last Modified: 29 Apr 2019 07:49
URI: http://eprints.iisc.ac.in/id/eprint/62117
