Urala, Bhargava K and Ramakrishnan, AG and Mohamed, Sahil (2014) Recognition of open vocabulary, online handwritten pages in Tamil script. In: International Conference on Signal Processing and Communications (SPCOM), July 22-25, 2014, Bangalore, India.
Abstract
In this work, we describe a system that recognizes open vocabulary, isolated, online handwritten Tamil words, and we extend it to recognize a paragraph of writing. We explain in detail each step involved in the process: segmentation, preprocessing, feature extraction, classification and bigram-based post-processing. On our database of 45,000 handwritten words collected using a tablet PC, we obtain symbol-level accuracies of 78.5% without and 85.3% with post-processing using symbol-level language models. The corresponding word-level accuracies are 40.1% and 59.6%. We also propose a line- and word-level segmentation strategy, which gives promising results of 100% line segmentation accuracy and 98.1% word segmentation accuracy in our initial trials on 40 handwritten paragraphs. The two modules have been combined to obtain a full-fledged page recognition system for online handwritten Tamil data. To the knowledge of the authors, this is the first attempt at recognition of open vocabulary, online handwritten paragraphs in any Indian language.
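The abstract does not spell out how the bigram-based post-processing works. As a rough illustration only, the sketch below shows one common way a symbol-level bigram language model can rescore classifier output: a Viterbi search over the candidate symbols at each position. The function name `viterbi_bigram`, the `lm_weight` and `floor` parameters, the placeholder symbols, and all probabilities are hypothetical and are not taken from the paper.

```python
import math


def viterbi_bigram(candidates, bigram_logp, lm_weight=1.0, floor=-10.0):
    """Pick the best symbol sequence given per-position classifier scores.

    candidates:  list of dicts, one per symbol position, mapping each
                 candidate symbol to its classifier log-score (assumed format).
    bigram_logp: dict mapping (previous_symbol, symbol) to a log-probability.
    lm_weight:   hypothetical weight on the language model term.
    floor:       hypothetical log-probability for unseen bigrams.
    """
    if not candidates:
        return []

    # best[s] = (score, path) of the best path so far that ends in symbol s
    best = {s: (score, [s]) for s, score in candidates[0].items()}

    for position in candidates[1:]:
        new_best = {}
        for sym, score in position.items():
            # Choose the predecessor that maximises path score + transition.
            prev_sym, (prev_score, prev_path) = max(
                best.items(),
                key=lambda kv: kv[1][0]
                + lm_weight * bigram_logp.get((kv[0], sym), floor),
            )
            transition = lm_weight * bigram_logp.get((prev_sym, sym), floor)
            new_best[sym] = (prev_score + transition + score, prev_path + [sym])
        best = new_best

    return max(best.values(), key=lambda v: v[0])[1]


# Toy data: Latin letters stand in for Tamil symbols; all numbers are made up.
candidates = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"c": math.log(0.55), "d": math.log(0.45)},
]
bigram_logp = {
    ("a", "c"): math.log(0.1),
    ("a", "d"): math.log(0.9),
    ("b", "c"): math.log(0.5),
    ("b", "d"): math.log(0.5),
}

with_lm = viterbi_bigram(candidates, bigram_logp)
without_lm = viterbi_bigram(candidates, bigram_logp, lm_weight=0.0)
```

In this toy case the classifier alone prefers "c" in the second position, while the bigram model makes the pair "a", "d" more likely overall, so the rescored output changes. This is the kind of correction that could account for a gain from post-processing, although the paper's actual method may differ.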
Item Type: | Conference Proceedings |
---|---|
Publisher: | IEEE |
Additional Information: | Copyright for this article belongs to the IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA |
Department/Centre: | Division of Electrical Sciences > Electrical Communication Engineering; Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 30 Dec 2015 06:08 |
Last Modified: | 30 Dec 2015 06:08 |
URI: | http://eprints.iisc.ac.in/id/eprint/52983 |