An OCR free method for word spotting in printed documents: The evaluation of different feature sets

Rios, Israel and de Souza Britto, Alceu and Koerich, Alessandro Lameiras and Oliveira, Luis Eduardo Soares

Journal of Universal Computer Science 2011

Abstract: An OCR-free word spotting method is developed and evaluated under a strong experimental protocol. Different feature sets are evaluated under the same experimental conditions. In addition, a tuning process in the document segmentation step is proposed, which provides a significant reduction in processing time. For this purpose, a complete OCR-free method for word spotting in printed documents was implemented, and a document database containing document images and their corresponding ground-truth text files was created. A strong experimental protocol based on 800 document images allows us to compare the results of the three feature sets used to represent the word image. © J.UCS.