
OTCYMIST: Otsu-Canny minimal spanning tree for born-digital images

Kumar, Deepak and Ramakrishnan, AG (2012) OTCYMIST: Otsu-Canny minimal spanning tree for born-digital images. In: 2012 10th IAPR International Workshop on Document Analysis Systems (DAS), 27-29 March 2012, Gold Coast, QLD.

Official URL: http://dx.doi.org/

Abstract

Text segmentation and localization algorithms are proposed for the born-digital image dataset. Binarization and edge detection are separately carried out on the three colour planes of the image. Connected components (CCs) obtained from the binarized image are thresholded based on their area and aspect ratio. CCs which contain sufficient edge pixels are retained. A novel approach is presented, where the text components are represented as nodes of a graph. Nodes correspond to the centroids of the individual CCs. Long edges are broken from the minimum spanning tree of the graph. Pairwise height ratio is also used to remove likely non-text components. A new minimum spanning tree is created from the remaining nodes. Horizontal grouping is performed on the CCs to generate bounding boxes of text strings. Overlapping bounding boxes are removed using an overlap area threshold. Non-overlapping and minimally overlapping bounding boxes are used for text segmentation. Vertical splitting is applied to generate bounding boxes at the word level. The proposed method is applied on all the images of the test dataset and values of precision, recall and H-mean are obtained using different approaches.
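
The graph step described in the abstract (CC centroids as nodes, a minimum spanning tree over them, long edges broken) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Prim's algorithm on Euclidean distances, and the `max_edge_length` threshold are all assumptions made for illustration, since the abstract does not state how the tree is built or how "long" is defined. The height-ratio filtering and the later grouping steps are not covered here.

```python
import math
from collections import defaultdict


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns a list of (i, j, length) tuples, one per tree edge.
    The choice of Prim's algorithm is an assumption for this sketch.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n       # tree node providing that connection
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Update connection costs for the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def group_by_cutting_long_edges(points, max_edge_length):
    """Break MST edges longer than `max_edge_length` (a hypothetical
    threshold) and return the resulting groups of point indices."""
    adjacency = defaultdict(list)
    for i, j, d in minimum_spanning_tree(points):
        if d <= max_edge_length:
            adjacency[i].append(j)
            adjacency[j].append(i)
    seen = set()
    groups = []
    for start in range(len(points)):
        if start in seen:
            continue
        # Depth-first traversal collects one connected group.
        stack = [start]
        seen.add(start)
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


# Made-up centroids: three CCs close together, two more far to the right.
example_centroids = [(10, 10), (22, 11), (35, 10), (200, 12), (212, 11)]
example_groups = group_by_cutting_long_edges(example_centroids, max_edge_length=30)
```

With the made-up centroids above, the single long edge between the two clusters is removed, so the first three components and the last two end up in separate groups. In practice the threshold would need to depend on image scale or character size; the abstract does not say how it is chosen.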

Item Type: Conference Paper
Additional Information: Copyright of this article belongs to IEEE.
Keywords: Binarization; Edge Detection; Minimum Spanning Tree; Text Segmentation; Text Localization
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Depositing User: Id for Latest eprints
Date Deposited: 02 Jul 2013 07:47
Last Modified: 02 Jul 2013 07:47
URI: http://eprints.iisc.ac.in/id/eprint/46544
