Two Approaches for Text Segmentation in Web Images

D. Karatzas, A. Antonacopoulos

Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR2003), Edinburgh, UK, August 2003, pp. 131-136

Abstract

There is a significant need to recognise the text in images on web pages, both for effective indexing and for presentation by non-visual means (e.g., audio). This paper presents and compares two novel methods for segmenting characters for subsequent extraction and recognition. The novelty of both approaches lies in combining topological features of characters (different in each case) with an anthropocentric perspective of colour perception, in preference to analysis in RGB space. Both approaches enable the extraction of text in complex situations, such as when colour and texture vary in both the characters and the background.

Citation

D. Karatzas, A. Antonacopoulos, "Two Approaches for Text Segmentation in Web Images", Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR2003), Edinburgh, UK, August 2003, pp. 131-136

DOI

10.1109/ICDAR.2003.1227646
