Provided to participants:
To be delivered by participants:
The ground truth for each image is provided in the PAGE (Page Analysis and Ground truth Elements) format. For a description of the relevant parts (for this competition) of the XML file structure please see the section "Page analysis and recognition results" below.
PAGE was developed from long experience in creating, managing and using datasets, including the PRImA Layout Analysis Dataset and the large historical document dataset of the EU-funded IMPACT project.
More details on the PAGE format can be found in the following paper:
S. Pletschacher, A. Antonacopoulos, "The PAGE (Page Analysis and Ground-Truth Elements) Format Framework", Proceedings of the 20th International Conference on Pattern Recognition (ICPR2010), Istanbul, Turkey, August 23-26, 2010, IEEE-CS Press, pp. 257-260.
And in the actual XML Schema:
http://schema.primaresearch.org/PAGE/gts/pagecontent/2018-07-15/pagecontent.xsd
The format provides for the representation of several different region types, which may be subject to different processing in recognition systems. The most important types of region are text, image, line drawing, graphic, table, chart, map, separator, maths, and noise. In the ground truth, the highest-level textual regions correspond to paragraphs (a conscious choice as a paragraph is also a complete logical entity, as opposed to columns of text for instance).
For each region there is a description of its outline in the form of a closely fitting polygon. Such a representation enables a very accurate and efficient geometric description, especially for complex-shaped regions. Text regions may also contain Unicode text content.
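As a rough illustration of the structure described above, the following Python sketch reads the top-level regions of a PAGE file and extracts their polygon outlines and text. The sample XML is made up for this example; the element and attribute names (Coords with a points attribute, TextEquiv/Unicode) are taken from our reading of the 2018-07-15 schema and should be checked against the schema itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2018-07-15 PAGE schema linked above.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2018-07-15"
NS = {"pc": PAGE_NS}

# Hypothetical sample content, invented for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2018-07-15">
  <Metadata>
    <Creator>example</Creator>
    <Created>2019-01-01T00:00:00</Created>
    <LastChange>2019-01-01T00:00:00</LastChange>
  </Metadata>
  <Page imageFilename="example.tif" imageWidth="1000" imageHeight="1400">
    <TextRegion id="r0" type="paragraph">
      <Coords points="100,100 500,100 500,220 100,220"/>
      <TextEquiv><Unicode>Sample paragraph text.</Unicode></TextEquiv>
    </TextRegion>
    <ImageRegion id="r1">
      <Coords points="100,300 500,300 500,700 100,700"/>
    </ImageRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_regions(xml_text):
    """Return a list of dicts for the top-level regions of a PAGE document."""
    root = ET.fromstring(xml_text)
    page = root.find("pc:Page", NS)
    regions = []
    for elem in page:
        tag = elem.tag.split("}")[-1]  # strip the namespace part
        if not tag.endswith("Region"):
            continue
        coords = elem.find("pc:Coords", NS)
        unicode_elem = elem.find("pc:TextEquiv/pc:Unicode", NS)
        regions.append({
            "id": elem.get("id"),
            "kind": tag,                  # e.g. TextRegion, ImageRegion
            "subtype": elem.get("type"),  # e.g. paragraph (may be None)
            "polygon": parse_points(coords.get("points")) if coords is not None else [],
            "text": unicode_elem.text if unicode_elem is not None else None,
        })
    return regions


regions = read_regions(SAMPLE)
for region in regions:
    print(region["id"], region["kind"], region["subtype"], len(region["polygon"]))
```

This sketch only visits regions that are direct children of the Page element; regions nested inside other regions would need a recursive walk.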
A simple example XML is described in this document
Authors of methods should submit the following by e-mail to the organisers:
The results must be stored in the PAGE format (same format as the ground-truth provided). Evaluation will be based on detected regions (location, type and subtype) and detected text. Further metadata will not be taken into account.
Open source tools for exporting in the PAGE format are available from the PRImA Tools website.
Alternatively you can produce PAGE files using your own XML library, following the PAGE Schema.
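For those writing PAGE files with their own XML library, a minimal sketch using Python's standard xml.etree.ElementTree might look as follows. The function name build_page, the dictionary keys and the creator string are hypothetical and chosen for this example; the element names and their order follow our reading of the 2018-07-15 schema. Output should still be validated against the schema (for instance by opening it in a PAGE viewer).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2018-07-15"
# Serialise PAGE elements in the default namespace (no prefix).
ET.register_namespace("", PAGE_NS)


def q(tag):
    """Qualify a tag name with the PAGE namespace."""
    return f"{{{PAGE_NS}}}{tag}"


def build_page(image_filename, width, height, regions, creator="example-method"):
    """Build a PcGts tree from a list of region dicts (hypothetical structure).

    Each dict has 'kind' (e.g. 'TextRegion'), 'polygon' (list of (x, y)),
    and optionally 'subtype' and 'text'.
    """
    root = ET.Element(q("PcGts"))
    meta = ET.SubElement(root, q("Metadata"))
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    ET.SubElement(meta, q("Creator")).text = creator
    ET.SubElement(meta, q("Created")).text = now
    ET.SubElement(meta, q("LastChange")).text = now

    page = ET.SubElement(root, q("Page"), {
        "imageFilename": image_filename,
        "imageWidth": str(width),
        "imageHeight": str(height),
    })
    for index, region in enumerate(regions):
        attrs = {"id": f"r{index}"}
        if region.get("subtype"):
            attrs["type"] = region["subtype"]
        elem = ET.SubElement(page, q(region["kind"]), attrs)
        points = " ".join(f"{x},{y}" for x, y in region["polygon"])
        ET.SubElement(elem, q("Coords"), {"points": points})
        if region.get("text") is not None:
            equiv = ET.SubElement(elem, q("TextEquiv"))
            ET.SubElement(equiv, q("Unicode")).text = region["text"]
    return ET.ElementTree(root)


tree = build_page("example.tif", 1000, 1400, [
    {"kind": "TextRegion", "subtype": "paragraph",
     "polygon": [(100, 100), (500, 100), (500, 220), (100, 220)],
     "text": "Sample paragraph text."},
])
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
print(xml_text)
```

To write the result to disk, tree.write(path, encoding="utf-8", xml_declaration=True) can be used.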
Aletheia, a PAGE viewer and editor, is also available for download, so you can preview your results and check the validity of the XML files you produce.
Filenames of submitted PAGE files should match the name of the original image.
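One way to derive and check result filenames is sketched below. It assumes that "match" means sharing the image's base name and that result files carry an .xml extension; both are assumptions of this example, and the function names are hypothetical.

```python
from pathlib import Path


def page_filename_for(image_path):
    """Return the result filename sharing the image's base name (assumed .xml)."""
    return Path(image_path).with_suffix(".xml").name


def missing_results(image_paths, result_paths):
    """List images for which no result file with a matching base name exists."""
    available = {Path(p).name for p in result_paths}
    return [p for p in image_paths if page_filename_for(p) not in available]


print(page_filename_for("images/00000123.tif"))
print(missing_results(["a.tif", "b.tif"], ["out/a.xml"]))
```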
Some documents will have image regions with overlaid text (see screenshots). Such nested regions should be stored as child XML elements of the surrounding parent region (e.g. table, chart, graphic, map, image). Only readable and front-facing text should be included.
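The nesting described above could be produced along the following lines. This is a sketch with made-up coordinates, identifiers and text; the helper add_region is hypothetical, and the placement of the child region after the parent's Coords element reflects our reading of the 2018-07-15 schema.

```python
import xml.etree.ElementTree as ET

PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2018-07-15"
ET.register_namespace("", PAGE_NS)


def add_region(parent, kind, region_id, polygon, text=None):
    """Append a region element to parent and return it (hypothetical helper)."""
    elem = ET.SubElement(parent, f"{{{PAGE_NS}}}{kind}", {"id": region_id})
    points = " ".join(f"{x},{y}" for x, y in polygon)
    ET.SubElement(elem, f"{{{PAGE_NS}}}Coords", {"points": points})
    if text is not None:
        equiv = ET.SubElement(elem, f"{{{PAGE_NS}}}TextEquiv")
        ET.SubElement(equiv, f"{{{PAGE_NS}}}Unicode").text = text
    return elem


page = ET.Element(f"{{{PAGE_NS}}}Page", {
    "imageFilename": "example.tif", "imageWidth": "1000", "imageHeight": "1400",
})

# Parent: an image region covering a photograph.
image = add_region(page, "ImageRegion", "r0",
                   [(50, 50), (400, 50), (400, 300), (50, 300)])

# Child: text overlaid on the photograph, stored inside the parent element.
add_region(image, "TextRegion", "r0_t0",
           [(80, 80), (200, 80), (200, 120), (80, 120)], text="OPEN")

print(ET.tostring(page, encoding="unicode"))
```

The text region appears inside the ImageRegion element rather than as a sibling directly under Page.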
Some pages in the evaluation set are full-page inverse video (light text on dark background). You can create your own training data by inverting the images of the example set (as shown above).
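The inversion itself is a per-sample operation. The sketch below assumes the image has already been decoded into a buffer of 8-bit samples (greyscale or RGB without an alpha channel); decoding and re-encoding the image file is left to whichever imaging library you use. The function name invert_8bit is hypothetical.

```python
# Lookup table mapping each 8-bit value v to 255 - v.
INVERT_TABLE = bytes(255 - v for v in range(256))


def invert_8bit(samples: bytes) -> bytes:
    """Invert a buffer of 8-bit samples (dark becomes light and vice versa)."""
    return samples.translate(INVERT_TABLE)


# A tiny made-up 2x2 greyscale image: dark text pixels (0) on a light page (255).
original = bytes([255, 0, 0, 255])
inverted = invert_8bit(original)
print(list(inverted))
```

Because inversion does not move any pixels, the region polygons of the example ground truth remain valid for the inverted images.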
The following images are representative of the variety of situations found in the evaluation dataset.