Performance Analysis of Document Image Analysis Subsystems

A. Antonacopoulos, A. Brough

Digest of IEE Colloquium on Document Image Processing and Multimedia (DIPM), Manchester, UK, 1999, pp. 15/1-15/4

Abstract

As a number of different document image analysis (DIA) algorithms start to mature, there is a significant need for objective evaluation and analysis of their performance. Significant activity has so far concentrated on evaluating OCR results, where the nature of the ground truth data (ASCII characters) lends itself to elaborate analysis using string matching theory to calculate errors and associated costs. Consequently, it has already been possible to automate OCR evaluation using large-scale test databases. Large-scale testing and evaluation is essential not only for OCR but also for each of the subsystems involved in DIA. This paper presents a new performance analysis framework that focuses on subsystems comprising the layout analysis stage of DIA. The most significant subsystems in this stage are page segmentation and classification. A critical overview of previous approaches to performance analysis for these subsystems is presented. Subsequently, the concept and current state of work towards a new framework for performance analysis developed at the University of Liverpool are presented.

Citation

A. Antonacopoulos, A. Brough, "Performance Analysis of Document Image Analysis Subsystems", Digest of IEE Colloquium on Document Image Processing and Multimedia (DIPM), Manchester, UK, 1999, pp. 15/1-15/4

DOI

10.1049/ic:19990215