Binarization is of significant importance in document analysis systems. It is an essential first step, prior to further
stages such as Optical Character Recognition (OCR), document segmentation, or enhancement of readability of
the document after some restoration stages. Hence, proper evaluation of binarization methods to verify their
effectiveness is of great value to the document analysis community. In this work, we perform a detailed, goal-oriented image quality evaluation of the 18 binarization methods that participated in the DIBCO 2011 competition, using the 16 historical document test images from the contest. We are interested in the
image quality assessment of the outputs generated by the different binarization algorithms as well as the OCR
performance, where possible. We compare our evaluation of the algorithms based on human perception of quality
to the DIBCO evaluation metrics. The results obtained provide an insight into the effectiveness of these methods
with respect to human perception of image quality as well as OCR performance.
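As a concrete illustration, the sketch below computes two measures commonly used in DIBCO-style binarization evaluation, F-measure and PSNR, between a binarized output and its ground truth. The pixel convention (0 = ink, 255 = background) is an assumption made for this sketch, not a detail taken from the paper.

```python
# Minimal sketch of two DIBCO-style binarization metrics.
# Assumes 0 = ink (foreground) and 255 = background.
import numpy as np

def f_measure(binarized: np.ndarray, ground_truth: np.ndarray) -> float:
    """F-measure over ink pixels."""
    pred_ink = binarized == 0
    true_ink = ground_truth == 0
    tp = np.logical_and(pred_ink, true_ink).sum()
    precision = tp / max(pred_ink.sum(), 1)
    recall = tp / max(true_ink.sum(), 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)

def psnr(binarized: np.ndarray, ground_truth: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images."""
    mse = np.mean((binarized.astype(float) - ground_truth.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / max(mse, 1e-12))
```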
Document layout analysis is of fundamental importance for document image understanding and information retrieval.
It requires the identification of blocks extracted from a document image via feature extraction and block
classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine
printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classification
of these blocks. We present a comparative evaluation of three ensemble-based classification algorithms (boosting,
bagging, and combined model trees) in addition to other known learning algorithms. Experimental results
are demonstrated for a set of 36,503 zones extracted from 416 document images randomly selected
from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of
the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in
conjunction with the Ocropus feature set, we further improve the performance of the block classification system
to obtain a classification accuracy of 99.21%.
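For readers unfamiliar with the ensemble methods compared above, the sketch below runs a bagging and a boosting classifier over placeholder block-feature vectors. The feature matrix and labels are synthetic stand-ins; the paper's actual feature set and the Ocropus features are not reproduced here.

```python
# Minimal sketch comparing bagging and boosting on block-feature vectors.
# X and y are random placeholders for the zone features and five-class labels.
import numpy as np
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 24))            # placeholder zone features
y = rng.integers(0, 5, size=500)     # 0..4: text, handwriting, graphics, image, noise

for name, clf in [
    ("bagging", BaggingClassifier(DecisionTreeClassifier(), n_estimators=50)),
    ("boosting", AdaBoostClassifier(n_estimators=50)),
]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```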
In previous work, we proposed the application of the Expectation-Maximization (EM) algorithm to the binarization of historical documents by defining a multi-resolution framework. In this work, we extend the multi-resolution
framework to the Otsu algorithm for effective binarization of historical documents. We compare the
effectiveness of the EM based binarization technique to the Otsu thresholding algorithm on historical documents.
We demonstrate how the EM algorithm can be extended to perform an effective segmentation of historical documents by
taking into account multiple features beyond the intensity of the document image. Experimental results, analysis
and comparisons to known techniques are presented using the document image collection from the DIBCO 2009
contest.
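As a point of reference for the thresholding baseline, the sketch below implements standard single-resolution Otsu thresholding; the multi-resolution extension described above would apply such a criterion at each level of an image pyramid. This is a generic textbook implementation, not the paper's code.

```python
# Minimal sketch of single-resolution Otsu thresholding on a
# grayscale (uint8) document image.
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # class-0 probability
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean
    mu_total = mu[-1]
    # Between-class variance for every candidate threshold.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# binary = (gray > otsu_threshold(gray)) * 255   # white background, black ink
```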
Large degradations in document images impede their readability and substantially deteriorate the performance of automated document processing systems. Image quality metrics have been defined to correlate with OCR accuracy. However, OCR accuracy does not always correlate with human perception of image quality. When enhancing
document images with the goal of improving readability, it is important to understand human perception
of quality. The goal of this work is to evaluate human perception of degradation and correlate it to known
degradation parameters and existing image quality metrics. The information captured enables the learning and
estimation of human perception of document image quality.
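One plausible form of the correlation analysis described above is sketched below, computing Pearson and Spearman correlations between human ratings and a quality metric. The rating and metric values shown are placeholders, not data from the study.

```python
# Minimal sketch of correlating human quality ratings with an image
# quality metric. All values below are illustrative placeholders.
import numpy as np
from scipy.stats import pearsonr, spearmanr

human_scores = np.array([4.1, 3.2, 2.8, 4.5, 1.9, 3.7])        # mean opinion scores (placeholder)
metric_values = np.array([0.81, 0.64, 0.52, 0.88, 0.31, 0.70])  # metric per image (placeholder)

r, _ = pearsonr(human_scores, metric_values)
rho, _ = spearmanr(human_scores, metric_values)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```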
In previous work, we showed that shape descriptor features can be used in Look Up Table (LUT) classifiers to
learn patterns of degradation and correction in historical document images. The algorithm encodes the pixel
neighborhood information effectively using a variant of a shape descriptor. However, the generation of the shape descriptor features was approached in a heuristic manner. In this work, we propose learning the shape features from the training data set using neural networks, specifically Multilayer Perceptrons (MLPs), for feature extraction. Given that the MLP may be restricted by a limited dataset, we apply a feature selection algorithm to
generalize, and thus improve, the feature set obtained from the MLP. We validate the effectiveness and efficiency
of the proposed approach via experimental results.
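One plausible reading of the MLP-based feature extraction is sketched below: an MLP is trained on neighborhood patches, and its hidden-layer activations are taken as the learned shape features. The patch size, network width, and data are illustrative assumptions.

```python
# Minimal sketch of using an MLP hidden layer as a learned feature extractor.
# The 7x7 patches and labels are random placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 49)).astype(float)  # 7x7 binary neighborhoods (placeholder)
y = rng.integers(0, 2, size=1000)                      # target pixel label (placeholder)

mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="relu", max_iter=300)
mlp.fit(X, y)

# Hidden-layer activations serve as the learned shape features.
features = np.maximum(0.0, X @ mlp.coefs_[0] + mlp.intercepts_[0])
print(features.shape)   # (1000, 16)
```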
In previous work, we showed that Look Up Table (LUT) classifiers can be trained to learn patterns of degradation
and correction in historical document images. The effectiveness of these classifiers is directly proportional to the size of the pixel neighborhood they consider. However, the computational cost increases almost exponentially with
the neighborhood size. In this paper, we propose a novel algorithm that encodes the neighborhood information
efficiently using a shape descriptor. Using shape descriptor features, we are able to characterize the pixel neighborhood of document images with far fewer bits and thus obtain an efficient system with significantly
reduced computational cost. Experimental results demonstrate the effectiveness and efficiency of the proposed
approach.
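For context, the sketch below shows the direct LUT baseline this work improves on: each 3x3 binary neighborhood is encoded as a 9-bit index into a table learned from degraded/clean training pairs. The shape-descriptor encoding proposed above compresses larger neighborhoods into far fewer bits than this direct scheme.

```python
# Minimal sketch of a direct 3x3-neighborhood LUT classifier on 0/1 images.
import numpy as np

WEIGHTS = (1 << np.arange(9)).reshape(3, 3)   # bit weights for a 3x3 window

def neighborhood_codes(binary: np.ndarray) -> np.ndarray:
    """Encode every 3x3 neighborhood of a 0/1 image as an integer in [0, 512)."""
    padded = np.pad(binary, 1)
    h, w = binary.shape
    codes = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            codes += padded[dy:dy + h, dx:dx + w] * WEIGHTS[dy, dx]
    return codes

def train_lut(degraded: np.ndarray, clean: np.ndarray) -> np.ndarray:
    """Majority-vote LUT: for each pattern, the most frequent clean pixel value."""
    codes = neighborhood_codes(degraded)
    votes = np.zeros(512)
    total = np.zeros(512)
    np.add.at(votes, codes.ravel(), clean.ravel())
    np.add.at(total, codes.ravel(), 1)
    return (votes / np.maximum(total, 1)) > 0.5

# Application: restored = lut[neighborhood_codes(degraded)]
```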
The fast evolution of scanning and computing technologies has led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations, such as legal litigation and security investigations, scanned collections are used to facilitate systematic exploration of the data. It is almost always the case that
scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large,
global degradation models do not perform well. In contrast, we propose to estimate local degradation models and
use them in enhancing degraded document images. Using a semi-automated enhancement system, we have labeled a subset of the Frieder diaries collection. This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with the approximate nearest neighbor algorithm. The resulting algorithm is highly efficient. Experimental evaluation results are provided using the Frieder diaries collection.
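A minimal sketch of the local-model idea follows: the image is tiled, and a simple per-tile statistic drives local enhancement. The Niblack-style statistic and tile size used here are illustrative substitutes, not the paper's LUT and nearest-neighbor models.

```python
# Minimal sketch of local (per-tile) degradation modeling: one threshold
# is estimated per tile instead of a single global model.
import numpy as np

def local_thresholds(gray: np.ndarray, tile: int = 64) -> np.ndarray:
    """Estimate one threshold per tile from local statistics."""
    h, w = gray.shape
    out = np.zeros_like(gray, dtype=float)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            block = gray[y:y + tile, x:x + tile].astype(float)
            # Local model: mean minus a fraction of the std (Niblack-style).
            out[y:y + tile, x:x + tile] = block.mean() - 0.2 * block.std()
    return out

# enhanced = (gray > local_thresholds(gray)) * 255
```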