
Creation of a Huge Annotated Database for Tamil and Kannada OHR

Nethravathi, B and Archana, CP and Shashikiran, K and Ramakrishnan, AG (2011) Creation of a Huge Annotated Database for Tamil and Kannada OHR. In: 2010 International Conference on Frontiers in Handwriting Recognition (ICFHR), 16-18 Nov. 2010, Kolkata.

Official URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumb...

Abstract

This paper describes the efforts at MILE lab, IISc, to create a 100,000-word database each in Kannada and Tamil for the design and development of online handwriting recognition (OHR). The data were collected from over 600 users in order to capture variations in writing style. We describe the features of the scripts and how the number of symbols was reduced so that the data could be used to train recognizers effectively. The word list includes all the characters, Kannada and Indo-Arabic numerals, punctuation marks and other symbols. A semi-automated tool is used to annotate the data from the stroke level to the word level. It segments each word into stroke groups and also acts as a validation mechanism for the segmentation. The tool displays the strokes, stroke groups and aksharas of a word, so it can also be used to study writing styles and delayed strokes and to assign quality tags to the words. The tool is currently being used to annotate Tamil and Kannada data. The output is stored in a standard XML format.

Item Type: Conference Paper
Publisher: IEEE
Additional Information: Copyright 2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Keywords: Annotation; Online character database; OHR database; Tamil handwriting; Kannada handwriting
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 27 Dec 2011 08:55
Last Modified: 27 Dec 2011 08:55
URI: http://eprints.iisc.ac.in/id/eprint/42912
