Evaluation of document binarization using eigen value decomposition

Kumar, Deepak and Prasad, MN Anil and Ramakrishnan, AG (2013) Evaluation of document binarization using eigen value decomposition. In: 20th Conference on Document Recognition and Retrieval (DRR) held as part of the IS&T/SPIE Symposium on Electronic Imaging, FEB 05-07, 2013, San Francisco, CA.

Official URL: http://dx.doi.org/10.1117/12.2008502

Abstract

Binarization, which is essentially the segmentation of the document, is a necessary step in the recognition of scanned documents. The literature offers several algorithms for binarizing a scanned document. Which binarization result is best for a given document image? To answer this question, a user needs to check different binarization algorithms for suitability, since different algorithms may work better for different types of documents. Manually choosing the best from a set of binarized documents is time-consuming. To automate the selection of the best segmented document, we need either the ground truth of the document or an evaluation metric. If ground truth is available, then precision and recall can be used to choose the best binarized document. What if ground truth is not available? Can we come up with a metric that evaluates these binarized documents? To address this, we propose a metric to evaluate binarized document images using eigen value decomposition. We have evaluated this measure on the DIBCO and H-DIBCO datasets. The proposed method chooses the best binarized document, one that is close to the ground truth of the document.
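
The ground-truth case mentioned in the abstract can be illustrated with a minimal sketch. It is not the eigen value decomposition metric proposed in the paper, which this record does not describe. It only shows how precision and recall against a ground-truth image might rank several binarized candidates. The names `precision_recall`, `f_measure`, and `pick_best` are hypothetical, the pixel convention (1 = text, 0 = background) is an assumption, and combining the two scores through an F-measure is one common choice rather than something stated in the abstract.

```python
from typing import Dict, List, Tuple

# Assumption: images are equal-sized grids where 1 = text pixel, 0 = background.
Image = List[List[int]]


def precision_recall(candidate: Image, truth: Image) -> Tuple[float, float]:
    """Pixel-wise precision and recall of a binarized candidate against ground truth."""
    tp = fp = fn = 0
    for cand_row, truth_row in zip(candidate, truth):
        for c, t in zip(cand_row, truth_row):
            if c == 1 and t == 1:
                tp += 1      # text pixel correctly detected
            elif c == 1 and t == 0:
                fp += 1      # background marked as text
            elif c == 0 and t == 1:
                fn += 1      # text pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (an assumed way to combine them)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pick_best(candidates: Dict[str, Image], truth: Image) -> str:
    """Return the name of the candidate with the highest F-measure."""
    return max(
        candidates,
        key=lambda name: f_measure(*precision_recall(candidates[name], truth)),
    )
```

Given a dictionary mapping algorithm names to their binarized outputs, `pick_best(outputs, ground_truth)` returns the name of the top-scoring output. When no ground truth exists this approach cannot be applied, which is the gap the proposed metric is meant to fill.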

Item Type: Conference Proceedings
Series: Proceedings of SPIE
Publisher: SPIE-INT SOC OPTICAL ENGINEERING
Additional Information: Copyright for this article belongs to SPIE-INT SOC OPTICAL ENGINEERING, San Francisco, CA.
Keywords: binarization; evaluation; eigen value decomposition; threshold; degraded documents; document quality measure
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 03 Jan 2014 11:06
Last Modified: 03 Jan 2014 11:06
URI: http://eprints.iisc.ac.in/id/eprint/48006
