Handwritten word recognition using Markov models

Koerich, Alessandro L.

IEEE Latin America Transactions 2004

Abstract: This paper presents a handwriting recognition system that handles unconstrained handwriting and large vocabularies. The system follows a segmentation-recognition paradigm: words are first loosely segmented into characters, and the final segmentation is obtained during the lexicon-driven recognition process. Characters are modeled by multiple hidden Markov models (HMMs), which are concatenated to build word models. The recognition algorithm splits the decoding of words into two levels, a state level and a character level. This allows character likelihoods to be pre-computed and then reused to decode every word in the lexicon, avoiding repeated computation of state sequences. A rejection mechanism accepts or rejects word hypotheses to improve the reliability of the recognition system. Experimental results on a dataset of 4,674 handwritten words show that, at a 0% rejection level, the system achieves recognition rates from 98.8% for a 10-word vocabulary to 68.6% for an 80,000-word vocabulary, with recognition times from 10 ms to 14.4 s, respectively. At a 30% rejection level, recognition rates range from 100% to 87%.
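
The two-level decoding idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the function names, the toy character model that stands in for state-level HMM decoding, the `max_span` limit on how many segments one character may cover, and the margin-based rejection rule (the abstract does not say how hypotheses are rejected). The sketch only shows the structure: character scores are computed once per segment span, then reused by a character-level dynamic program for every word in the lexicon.

```python
import math

NEG_INF = float("-inf")

def toy_char_model(ch):
    # Hypothetical stand-in for state-level HMM decoding of one character:
    # each segment carries a label, and a matching label is more likely.
    def log_likelihood(span):
        return sum(math.log(0.8 if seg == ch else 0.1) for seg in span)
    return log_likelihood

def precompute_char_scores(char_models, segments, max_span):
    # State level: score each character model on each span of segments, once.
    table = {}
    n = len(segments)
    for ch, model in char_models.items():
        for start in range(n):
            for end in range(start + 1, min(n, start + max_span) + 1):
                table[(ch, start, end)] = model(segments[start:end])
    return table

def decode_word(word, table, n):
    # Character level: best log-score of aligning the word's characters,
    # in order, to all n segments, using only the pre-computed table.
    best = [NEG_INF] * (n + 1)
    best[0] = 0.0
    for ch in word:
        nxt = [NEG_INF] * (n + 1)
        for start in range(n):
            if best[start] == NEG_INF:
                continue
            for end in range(start + 1, n + 1):
                score = table.get((ch, start, end))
                if score is not None:
                    nxt[end] = max(nxt[end], best[start] + score)
        best = nxt
    return best[n]

def recognize(lexicon, segments, max_span=2, reject_margin=1.0):
    # Assumes a non-empty lexicon.
    char_models = {ch: toy_char_model(ch) for ch in set("".join(lexicon))}
    table = precompute_char_scores(char_models, segments, max_span)
    n = len(segments)
    ranked = sorted(((decode_word(w, table, n), w) for w in lexicon), reverse=True)
    top_score, top_word = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else NEG_INF
    # Assumed rejection rule: reject when the best hypothesis is impossible
    # or not clearly ahead of the second-best one.
    if top_score == NEG_INF or top_score - runner_up < reject_margin:
        return None
    return top_word

# The letter "a" is over-segmented into two pieces; decoding merges them.
print(recognize(["cat", "cot", "at"], ["c", "a", "a", "t"]))  # cat
# Both hypotheses score identically, so the word is rejected.
print(recognize(["at", "to"], ["x", "x"]))  # None
```

The cost of the state-level step depends on the alphabet and the number of segment spans, not on the lexicon size, which is why sharing the table across words matters more as the vocabulary grows.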