
Improving Text Recognition in Images of Natural Scenes

21 Nov
Thursday, 11/21/2013 9:30am to 11:30am
Ph.D. Thesis Defense

Jacqueline Feild

Computer Science Building, Room 151

The area of scene text recognition focuses on the problem of recognizing arbitrary text in images of natural scenes. Examples of scene text include street signs, business signs, grocery item labels, and license plates. With the growing use of smartphones and digital cameras, the ability to accurately recognize text in images is becoming increasingly useful, and many people will benefit from advances in this area.

The goal of this thesis is to develop methods for improving scene text recognition. We do this by incorporating new types of information into models and by exploring how to compose simple components into highly effective systems. We focus on three areas of scene text recognition, each making fewer prior assumptions than the last. First, we introduce two techniques for character recognition, where word and character bounding boxes are assumed. We describe a character recognition system that incorporates similarity information in a novel way and a new language model that models syllables in a word to produce word labels that can be pronounced in English. Next, we look at word recognition, where only word bounding boxes are assumed. We develop a new technique for segmenting text in these images called bilateral regression segmentation, and we introduce an open-vocabulary word recognition system that uses a very large web-based lexicon to achieve state-of-the-art recognition performance. Lastly, we remove the assumption that words have been located and describe an end-to-end system that detects and recognizes text in any natural scene image.

Advisor: Erik Learned-Miller