NESP: Nonlinear enhancement and selection of plane for optimal segmentation and recognition of scene word images

Kumar, Deepak and Prasad, MN Anil and Ramakrishnan, AG (2013) NESP: Nonlinear enhancement and selection of plane for optimal segmentation and recognition of scene word images. In: 20th Conference on Document Recognition and Retrieval (DRR), Feb 05-07, 2013, San Francisco, CA, USA.

Official URL: http://dx.doi.org/10.1117/12.2008519


In this paper, we report a breakthrough result on the difficult task of segmenting and recognizing coloured text in the word image dataset of the ICDAR robust reading competition, challenge 2: reading text in scene images. We split each word image into individual colour, gray and lightness planes and enhance the contrast of each plane independently with a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding, and the plane with the maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used to recognize the binarized words. Our recognition results on the ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As a baseline, images binarized by simple global and local thresholding techniques were also recognized. The word recognition rate obtained by our nonlinear enhancement and selection of plane (NESP) method is 72.8% and 66.2% for the ICDAR 2011 and 2003 word datasets, respectively. Using a toolkit we developed, we have created pixel-level ground truth for each image to benchmark these datasets. The recognition rate of the benchmarked images is 86.7% and 83.9% for the ICDAR 2011 and 2003 datasets, respectively.
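The plane-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exponents of the power-law transform, the exact definitions of the gray and lightness planes, or the histogram resolution, so the gamma values, the luma weights, the HSL-style lightness and the 256-bin histogram below are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def otsu_max_between_class_variance(plane, bins=256):
    """Return (max between-class variance, threshold) for values in [0, 1]."""
    hist, edges = np.histogram(plane, bins=bins, range=(0.0, 1.0))
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)            # probability of the class below each candidate cut
    m = np.cumsum(p * centers)   # cumulative first moment
    mt = m[-1]                   # global mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)  # skip cuts that leave one class empty
    sigma_b = np.zeros(bins)
    sigma_b[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(sigma_b))
    return sigma_b[k], edges[k + 1]


def split_planes(rgb_uint8):
    """Split an H x W x 3 uint8 image into colour, gray and lightness planes."""
    rgb = np.asarray(rgb_uint8, dtype=float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return {
        "red": r,
        "green": g,
        "blue": b,
        # Assumption: standard luma weights for the gray plane.
        "gray": 0.299 * r + 0.587 * g + 0.114 * b,
        # Assumption: HSL-style lightness, (max + min) / 2.
        "lightness": (rgb.max(axis=-1) + rgb.min(axis=-1)) / 2.0,
    }


def nesp_sketch(rgb_uint8, gammas=(0.5, 1.0, 2.0)):
    """Pick the (plane, gamma) pair with the largest discrimination factor.

    The gamma candidates are illustrative assumptions, not values from the paper.
    Returns (score, plane name, gamma, boolean mask).
    """
    best = None
    for name, plane in split_planes(rgb_uint8).items():
        for gamma in gammas:
            enhanced = plane ** gamma  # power-law (gamma) transform
            score, threshold = otsu_max_between_class_variance(enhanced)
            if best is None or score > best[0]:
                best = (score, name, gamma, enhanced >= threshold)
    return best
```

Calling `nesp_sketch(image)` on an RGB word image returns the winning plane name, the exponent and a boolean mask of the brighter class. The sketch does not normalize text polarity, so the mask may mark either the text or the background as `True`, and the recognition step is out of scope here.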

Item Type: Conference Proceedings
Series: Proceedings of SPIE
Additional Information: Copyright for this article belongs to SPIE-INT SOC OPTICAL ENGINEERING, USA
Keywords: nonlinear enhancement; power-law transform; text polarity inversion; binarization; evaluation; threshold; recognition; normality test
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 03 Jan 2014 11:06
Last Modified: 03 Jan 2014 11:06
URI: http://eprints.iisc.ac.in/id/eprint/48005
