Platform-independent libraries for Java and C++ that create valid layout descriptions in the PAGE XML format. The libraries can be easily integrated into other software projects, such as page segmentation methods for ICDAR competitions.
The Java PAGE Libraries are a comprehensive set of Java classes that can be easily integrated into your tool to produce valid PAGE instances. The download contains a basic usage example.
The C++ PAGE Exporter is a basic C++ class that can be easily integrated into your tool to produce valid PAGE instances.
Alternatively, you can produce PAGE files with your own XML library by following the PAGE Schema, which can be found at: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
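As a rough illustration of the "own XML library" route, the sketch below builds a minimal PAGE document with Python's standard xml.etree.ElementTree. It is not taken from the PAGE libraries. The function name build_page_xml, the creator string "example-tool" and the sample region are hypothetical, and the element and attribute names (PcGts, Metadata, Page, TextRegion, Coords with a points attribute) are written from a reading of the 2019-07-15 schema. The sketch does not validate its output, so check the result against the XSD above before relying on it.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace of the 2019-07-15 PAGE schema (taken from the schema URL above).
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def build_page_xml(image_filename, width, height, regions):
    """Return a PAGE XML string (hypothetical helper, not part of any PAGE library).

    regions is a list of (region_id, [(x, y), ...]) polygons in pixel coordinates.
    """
    # Serialise the PAGE namespace as the default namespace (no prefix).
    ET.register_namespace("", PAGE_NS)

    def q(tag):
        # Qualify a tag name with the PAGE namespace.
        return f"{{{PAGE_NS}}}{tag}"

    root = ET.Element(q("PcGts"))

    # Metadata block: creator plus creation and last-change timestamps.
    metadata = ET.SubElement(root, q("Metadata"))
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    ET.SubElement(metadata, q("Creator")).text = "example-tool"  # assumed value
    ET.SubElement(metadata, q("Created")).text = now
    ET.SubElement(metadata, q("LastChange")).text = now

    # One Page element describing the image the layout refers to.
    page = ET.SubElement(
        root,
        q("Page"),
        {
            "imageFilename": image_filename,
            "imageWidth": str(width),
            "imageHeight": str(height),
        },
    )

    # Each region is written as a TextRegion with a polygon outline.
    for region_id, polygon in regions:
        region = ET.SubElement(page, q("TextRegion"), {"id": region_id})
        points = " ".join(f"{x},{y}" for x, y in polygon)
        ET.SubElement(region, q("Coords"), {"points": points})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [("r1", [(10, 10), (200, 10), (200, 80), (10, 80)])]
    print(build_page_xml("scan_0001.png", 1000, 1500, sample))
```

Only text regions are written here to keep the sketch short. The schema defines further region types and optional content, which a real exporter would need to cover.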
A survey of OCR evaluation tools and metrics
In The 6th International Workshop on Historical Document Imaging and Processing (HIP '21). Association for Computing Machinery, New York, NY, USA, 13–18.
The PAGE (Page Analysis and Ground-Truth Elements) Format Framework
Proceedings of the 20th International Conference on Pattern Recognition (ICPR2010), Istanbul, Turkey, August 23-26, 2010, IEEE-CS Press, pp. 257-260