16 January 2006 Spotting words in handwritten Arabic documents
Proceedings Volume 6067, Document Recognition and Retrieval XIII; 606702 (2006) https://doi.org/10.1117/12.643107
Event: Electronic Imaging 2006, San Jose, California, United States
Abstract
The design and performance of a system for spotting handwritten Arabic words in scanned document images are presented. The system has three main components: a word segmenter, a shape-based word matcher, and a search interface. The user types a query in English into a search window; the system finds the equivalent Arabic word (e.g., by dictionary look-up) and then locates matching word images in an indexed (segmented) set of documents. The search is performed in two steps: (1) prototype selection, in which the query is used to obtain a set of handwritten samples of that word from a known set of writers (these are the prototypes), and (2) word matching, in which the prototypes are used to spot each occurrence of that word in the indexed document database. The entire set of test word images is ranked, where the ranking criterion is a similarity score between each prototype word and the candidate words, based on global word shape features. A database of 20,000 word images contained in 100 scanned handwritten Arabic documents written by 10 different writers was used to study retrieval performance. With five writers providing prototypes and the other five used for testing, and with manually segmented documents, 55% precision is obtained at 50% recall. Performance increases as more writers are used for training.
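The word-matching and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each word image has already been reduced to a fixed-length binary feature vector, that similarity is the fraction of agreeing bits, and that a candidate's score is its best score over all prototypes. The names `similarity`, `rank_candidates`, and `precision_at_recall`, and the toy data, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Features = Sequence[int]  # assumed: binary global word-shape feature vector


def similarity(a: Features, b: Features) -> float:
    """Fraction of positions at which two equal-length binary vectors agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("feature vectors must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def rank_candidates(
    prototypes: List[Features], candidates: Dict[str, Features]
) -> List[Tuple[str, float]]:
    """Score each candidate by its best match to any prototype, best first."""
    scored = [
        (cid, max(similarity(p, feats) for p in prototypes))
        for cid, feats in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def precision_at_recall(
    ranked_ids: List[str], relevant: Set[str], recall_level: float
) -> float:
    """Precision at the first rank where the requested recall is reached."""
    if not relevant or not 0.0 < recall_level <= 1.0:
        raise ValueError("need a non-empty relevant set and 0 < recall <= 1")
    needed = math.ceil(recall_level * len(relevant))
    hits = 0
    for rank, cid in enumerate(ranked_ids, start=1):
        if cid in relevant:
            hits += 1
            if hits >= needed:
                return hits / rank
    return 0.0  # requested recall is never reached in this ranking


# Toy data: two prototypes of the query word, four candidate word images.
prototypes = [(1, 0, 1, 1), (1, 1, 1, 0)]
candidates = {
    "w1": (1, 0, 1, 1),
    "w2": (0, 1, 0, 0),
    "w3": (1, 1, 1, 1),
    "w4": (0, 0, 0, 0),
}
relevant = {"w1", "w2"}  # assumed ground truth: true occurrences of the word

ranked = rank_candidates(prototypes, candidates)
ranked_ids = [cid for cid, _ in ranked]
p50 = precision_at_recall(ranked_ids, relevant, 0.5)
p100 = precision_at_recall(ranked_ids, relevant, 1.0)
```

Taking the maximum over prototypes is only one way to combine scores from several writers; averaging or another fusion rule would fit the same interface.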
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Sargur Srihari, Harish Srinivasan, Pavithra Babu, and Chetan Bhole "Spotting words in handwritten Arabic documents", Proc. SPIE 6067, Document Recognition and Retrieval XIII, 606702 (16 January 2006); https://doi.org/10.1117/12.643107
KEYWORDS
Image segmentation
Prototyping
Databases
Binary data
Human-machine interfaces
Image retrieval
Neural networks
