Deep learning approaches for image retrieval and pattern spotting in ancient documents

Wiggers, Kelly Lais and de Souza Britto Junior, Alceu and Koerich, Alessandro Lameiras and Heutte, Laurent and de Oliveira, Luiz Eduardo Soares

arXiv 2019

Abstract: This paper describes two approaches for content-based image retrieval and pattern spotting in document images using deep learning. The first approach copes with the lack of training data by using a pre-trained CNN model, which is fine-tuned to achieve a compact yet discriminant representation of queries and candidate images. The second approach uses a Siamese Convolutional Neural Network trained on a previously prepared subset of image pairs from the ImageNet dataset to provide similarity-based feature maps. In both methods, the learned representation scheme considers feature maps of different sizes, which are evaluated in terms of retrieval performance. A robust experimental protocol using two public datasets (Tobacco-800 and DocExplore) has shown that the proposed methods compare favorably against state-of-the-art document image retrieval and pattern spotting methods.
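
To make the retrieval step concrete, the following is a minimal sketch of how candidates might be ranked against a query once each image has been reduced to a feature vector. It is not the authors' implementation: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names `cosine_similarity` and `rank_candidates` as well as the toy vectors are hypothetical. In practice the vectors would come from the CNN feature maps rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector carries no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return (name, score) pairs sorted from most to least similar.

    `candidates` maps a hypothetical image identifier to its feature
    vector (assumed to be extracted beforehand by a CNN).
    """
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy feature vectors standing in for CNN descriptors (assumption).
    query = [1.0, 0.0, 1.0]
    candidates = {
        "a": [1.0, 0.0, 1.0],
        "b": [0.0, 1.0, 0.0],
        "c": [1.0, 1.0, 1.0],
    }
    for name, score in rank_candidates(query, candidates):
        print(name, round(score, 3))
```

A ranked list of this kind is the usual input to retrieval metrics; for pattern spotting, the same scoring would be applied to sub-regions of each page rather than to whole images.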