In the i2010 vision of a European Digital Library, the EU launched an ambitious plan for large-scale digitisation projects transforming Europe’s printed heritage into digitally available resources. The aim of fully integrating intellectual content into the modern information and communication technologies environment can only be achieved by full-text digitisation: transforming digital images of scanned books into electronic text.
Over the last 2-3 years mass digitisation has become one of the most prominent issues in the library world. Today, a number of advanced libraries in Europe are scanning millions of pages each year, and large-scale digitisation is a matter of fact, no longer a vision. However, these efforts can tackle only a fraction of the total heritage available in cultural memory organisations. The digitised material is becoming available too slowly and in too small quantities from too few sources, for three reasons.
The IMPACT project will remove many of these barriers. The project will push innovation in OCR technology and language technology for historical document processing and retrieval, and share expertise to build capacity in digitisation across Europe. During the project a Centre of Competence will be set up in order to provide a central service entry point for all libraries, archives and museums involved in the digitisation of textual material.
The consortium brings together twenty-six national and regional libraries, research institutions and commercial suppliers. They will share their know-how and best practices, develop innovative tools to enhance the capabilities of OCR engines and the accessibility of digitised text, and lay the foundations for the mass-digitisation programmes that will take place over the next decade.
Navigating the Storm: IMPACT, eMOP, and Agile Steering Standards
Digital Scholarship in the Humanities, 2015.
The IMPACT Dataset of Historical Document Images
Proceedings of the 2013 Workshop on Historical Document Imaging and Processing (HIP2013), Washington DC, USA, August 2013, pp. 123-130
A robust hybrid approach for text line segmentation in historical documents
Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan, November 11-15, 2012, IEEE-CS Press, pp. 335-338
Restoration of Arbitrarily Warped Historical Document Images Using Flow Lines
Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), Beijing, China, September 2011, pp. 905-909
Scenario Driven In-Depth Performance Evaluation of Document Layout Analysis Methods
Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), Beijing, China, September 2011, pp. 1404-1408
Aletheia - An Advanced Document Layout and Text Ground-Truthing System for Production Environments
Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), Beijing, China, September 2011, pp. 48-52
Historical Document Layout Analysis Competition
Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), Beijing, China, September 2011, pp. 1516-1520
Grid-Based Modelling and Correction of Arbitrarily Warped Historical Document Images for Large-Scale Digitisation
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing (HIP2011), Beijing, China, September 2011, pp. 106-111
The PAGE (Page Analysis and Ground-Truth Elements) Format Framework
Proceedings of the 20th International Conference on Pattern Recognition (ICPR2010), Istanbul, Turkey, August 23-26, 2010, IEEE-CS Press, pp. 257-260
A New Framework for Recognition of Heavily Degraded Characters in Historical Typewritten Documents Based on Semi-Supervised Clustering
Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 506-510
Word-Based Adaptive OCR for Historical Books
Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 501-505
A Realistic Dataset for Performance Evaluation of Document Layout Analysis
Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 296-300
ICDAR2009 Page Segmentation Competition
Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 1370-1374
A Geometric Approach for Accurate and Efficient Performance Evaluation of Layout Analysis Methods
Proceedings of the 19th International Conference on Pattern Recognition (ICPR2008), Tampa, Florida, USA, December 7-11, 2008, IEEE-CS Press