Our team here at Sensible Code have been busy for a few years working on an innovative privacy-preserving technology called Cantabular. Cantabular uses high-performance implementations of disclosure control algorithms to protect data in real time as a user or researcher makes a query.
The UK-based Office for National Statistics (ONS) has selected Cantabular to allow flexible dissemination of confidential Census 2021 data.
Aine McGuire, Commercial Director, Sensible Code Company:
We’re delighted to be working with the ONS, given its international reputation as the gold standard in statistical practice. Our technology will transform the way Census 2021 data is disseminated and deliver higher value to the economy, through better policy, better business decisions and valuable research. The software applies robust statistical disclosure control techniques in real time. The ONS is able to compute millions of tables of data at high speed whilst protecting anonymity and to ensure data are non-disclosive.
It’s a journey for the ONS
The outputs team at the ONS have been in consultation with users since 2017. From day one the statistical disclosure control professionals within the ONS have been engaged in the process. To give some sense of this, we’ve included references to some public events and consultations.
United Nations Economic and Social Council, Economic Commission for Europe Conference of European Statisticians (ECE/CES/GE.41/2018/18)
Abstract: The Office for National Statistics has been working to ensure 2021 Census outputs are more flexible, timely and accessible compared to the 2011 Census. This document outlines our strategic vision for the dissemination of 2021 Census outputs. We also set out the approach we have taken to gather feedback from a spectrum of users on our design and content and how we are planning to incorporate this feedback into our future research. In early 2018, we held a public consultation to outline our vision for the content and design of 2021 Census outputs.
This included our plans to disseminate the majority of census data through a single point of access via the ONS website using a flexible dissemination system. This will be enabled through an innovative combination of statistical disclosure control methods, which include targeted record swapping, and an automated layer of light-touch perturbation and final disclosure checks. We also set out our plans for the design and dissemination of specialist products, including microdata samples and origin-destination (flow) data products.
Applying Cell-Key Perturbation to 2021 Census Outputs
By Iain Dove, Stephanie Blanchard, and Keith Spicer
“In preparation for 2021, the disclosure control team is investigating several methods of protection, including use of targeted record swapping plus cell key perturbation, illustrated below.
This would specifically protect against disclosure by differencing and allow user-defined outputs to be distributed through an online table builder. The protection and checking will have been applied before the tables are made, so anything available to be built will not need to be checked, and tables will not need to be re-designed. Most protection would still come from targeted record swapping as before, with cell perturbation also protecting against differencing.”
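To make the idea in the quote above more concrete, here is a minimal sketch of cell-key perturbation in Python. It is illustrative only and is not the ONS's or Cantabular's implementation: the key range, the toy lookup rule in `lookup_perturbation`, and all function names are assumptions made for this example. The point it shows is that the noise added to a cell depends only on the records in that cell, so the same cell receives the same perturbation in whichever table it appears, which is what limits disclosure by differencing.

```python
import random
from collections import defaultdict

KEY_RANGE = 256  # assumed size of the record-key space (hypothetical)


def assign_record_keys(n_records, seed=0):
    """Give every record a fixed pseudo-random key, assigned once up front."""
    rng = random.Random(seed)
    return [rng.randrange(KEY_RANGE) for _ in range(n_records)]


def lookup_perturbation(count, cell_key):
    """Toy stand-in for a perturbation table (hypothetical rule).

    A real table would be designed by disclosure control specialists;
    here a small share of cell keys map to -1 or +1, the rest to 0.
    """
    if count == 0:
        return 0
    if cell_key < 16:
        return -1
    if cell_key >= KEY_RANGE - 16:
        return 1
    return 0


def perturbed_table(records, record_keys, variables):
    """Build a frequency table over `variables` with cell-key perturbation."""
    counts = defaultdict(int)
    key_sums = defaultdict(int)
    for record, record_key in zip(records, record_keys):
        cell = tuple(record[v] for v in variables)
        counts[cell] += 1
        key_sums[cell] += record_key
    table = {}
    for cell, count in counts.items():
        # The cell key is derived only from the records in the cell.
        cell_key = key_sums[cell] % KEY_RANGE
        table[cell] = count + lookup_perturbation(count, cell_key)
    return table


if __name__ == "__main__":
    people = [
        {"sex": "F", "area": "A"},
        {"sex": "M", "area": "A"},
        {"sex": "F", "area": "A"},
    ]
    keys = assign_record_keys(len(people), seed=1)
    print(perturbed_table(people, keys, ["sex"]))
    print(perturbed_table(people, keys, ["sex", "area"]))
```

In this toy data every person lives in area "A", so the cell for sex "F" and the cell for sex "F" in area "A" contain exactly the same records and therefore receive exactly the same perturbed count in both tables.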
…and it’s a journey for Sensible Code too
We’ve been working with the ONS since 2016. This is an enterprise-scale digital transformation and is driven by user needs. Census is a critically important dataset for the ONS and information assurance is paramount.
We’ve included some earlier posts that explain the approach to the product design. The product used to be called TableBuilder.