(22/01/2015)
PRImA is organising two ICDAR2015 competitions. The ICDAR2015 Competitions follow the successful running of all previous ICDAR Page Segmentation competitions (2001, 2003, 2005, 2007, 2009, 2011, 2013 Books and 2013 Newspapers). The proposed competitions will build upon the previous challenges, adding a new class of problems. As in 2013, the competitions will cover layout analysis as well as text recognition, in an end-to-end scenario. The motivation of both competitions is to evaluate existing approaches using a realistic dataset and an objective performance analysis system.
HBR2015 follows the successful running of all previous ICDAR Page Segmentation competitions, and focuses on historical books. This is the third competition run by PRImA with this focus. It presents challenges for page segmentation, region classification, and text recognition in an end-to-end scenario. The dataset used contains scanned pages from a wide range of historical books with a variety of layouts and conditions.
Historical books are of significant interest to Digital Humanities researchers and represent a large proportion of libraries’ holdings; they therefore continue to be the focus of large-scale digitisation projects. A number of artefacts frequently appear in scans of historical books, hindering layout analysis and text recognition.
RDCL2015 follows the successful running of all previous ICDAR Page Segmentation competitions, and focuses on contemporary documents with complex layouts. This is the sixth competition run by PRImA with this focus. It presents challenges for page segmentation, region classification, and text recognition in an end-to-end scenario. The dataset used contains scanned pages from contemporary magazines and technical articles.
Although this competition focuses on scanned pages of printed documents, the outcome is also applicable to the recognition of digital documents (e.g. PDF) with complex layouts. The page scans comprising the dataset are of good quality and have little to no distortion, noise, or other artefacts.