www.primaresearch.org

Census 1961

Introduction

A series of projects, funded by the Office for National Statistics, was conducted to reclaim the data of the 1961 Census for England and Wales. The published reports are currently available only as image files created from microfilm copies of the original computer printouts and book-form reports.

Extraction of Small Area Statistics (2017-2019)

This project aims to transform the 1961 Census microfilm data tables into a comprehensive dataset, structured so that researchers can carry out further analyses. We are looking at a range of data at different geographical levels, such as Districts, Wards, and Enumeration Districts.

Our workflow also includes a crowdsourcing component on Zooniverse to transcribe text that was not recognised correctly by the OCR pipeline.

Feasibility Study (2015-2017)

The Census 1961 Feasibility Study (concluded in 2017) was conducted to ascertain whether the complete 1961 Census data collection could be digitised, and the information extracted and made available online in a highly versatile form similar to that of more recent censuses.

The study was conducted in two parts by the authors in cooperation with the Office for National Statistics (ONS) from September 2015 to March 2017. Feasibility was tested by designing a digitisation pipeline, applying state-of-the-art page recognition systems, importing the extracted fields into a database, applying post-processing and quality assurance techniques, and evaluating the results. Two main questions were to be answered: what is the best way of digitising the material to maximise the quality of the output, and is that quality high enough to satisfy the requirements of a trustworthy, publicly accessible Census 1961 database?

A fully functional prototype pipeline was developed, comprising image preprocessing, page analysis and recognition, post-processing, and data export. Each part of the pipeline was evaluated individually by testing a range of analysis and recognition approaches on a representative data sample. Well-established performance evaluation metrics were used to measure precisely how variations in the workflow affected different types of data (image quality, page content, etc.). In addition, the accuracy of the extracted tabular data was evaluated using model-intrinsic rules, such as sums of values along table columns and/or rows and across different levels of geography.
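The sum-based rules mentioned above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the function names, the convention that the last value in each row is the stated total, and all figures are hypothetical and chosen only for illustration. The first check tests whether the components of a row add up to its printed total; the second tests whether the values of the child areas (for example, Wards) add up to those of the parent area (for example, a District).

```python
from typing import List


def rows_failing_total(rows: List[List[int]]) -> List[int]:
    """Return indices of rows whose components do not sum to the stated total.

    Assumption (hypothetical layout): the last value of each row is the total.
    """
    return [i for i, row in enumerate(rows) if sum(row[:-1]) != row[-1]]


def columns_failing_parent(parent: List[int], children: List[List[int]]) -> List[int]:
    """Return column indices where child-area values do not add up to the parent value."""
    return [
        col
        for col, expected in enumerate(parent)
        if sum(child[col] for child in children) != expected
    ]


# Made-up figures: [males, females, total] for three wards of one district.
wards_ok = [[120, 135, 255], [98, 110, 208], [64, 71, 135]]
# Same data with a simulated recognition error in the last ward's total.
wards_misread = [[120, 135, 255], [98, 110, 208], [64, 71, 185]]
district = [282, 316, 598]

if __name__ == "__main__":
    print(rows_failing_total(wards_ok))
    print(rows_failing_total(wards_misread))
    print(columns_failing_parent(district, wards_ok))
    print(columns_failing_parent(district, wards_misread))
```

In this sketch a failed check only signals that at least one value in the affected row or column is suspect; it does not identify which one. Combining row checks with cross-geography checks, as in the misread example, narrows the candidates down to a single cell.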

Data

A dataset with Census 1961 images can be found here

Ground truth for a small number of pages is available here


Related Publications

Crowdsourcing Historical Tabular Data – 1961 Census of England and Wales

C. Clausner, J. Hayes, A. Antonacopoulos

Proceedings of the 2019 Workshop on Historical Document Imaging and Processing (HIP2019), Sydney, Australia, September 2019, pp. 42-47

Creating a Complete Workflow for Digitising Historical Census Documents: Considerations and Evaluation

C. Clausner, J. Hayes, A. Antonacopoulos, S. Pletschacher

Proceedings of the 2017 Workshop on Historical Document Imaging and Processing (HIP2017), Kyoto, Japan, November 2017, pp. 83-88

Unearthing the Recent Past: Digitising and Understanding Statistical Information from Census Tables

C. Clausner, J. Hayes, A. Antonacopoulos, S. Pletschacher

Proceedings of the Second International Conference on Digital Access to Textual Cultural Heritage (DATeCH 2017), Goettingen, Germany, 1-2 June 2017
