Explaining the impact of differential privacy with real-time confidence interval tabulation
On 4th October 2024, Mike Thompson gave a presentation at UNECE Census Week on the impact of differential privacy with real-time confidence interval tabulation. Following the event, we recorded the talk again separately in order to share it more widely.
UNECE Census Week 2024
UNECE Census Week focused on the considerations and recommendations for the 2030 round of censuses. Senior statistical representatives from countries across Europe, Greater Eurasia and North America presented a wide range of topics in order to share experience, forge cooperative relationships and acknowledge similarities and differences in national statistical systems and practices.
Mike presented the challenges of calculating and communicating uncertainty in official statistics, and described how we modified our software Cantabular to help US Census users explore the 2010 and 2020 US Census Privacy Protected Microdata Files (PPMFs). He outlined the US Census Bureau’s openness and transparency in adopting Differential Privacy as its preferred approach to protecting confidential data, and explained the evolution of the Top Down Algorithm. Finally, he highlighted the importance of planning a dissemination strategy early in the process, at the same time as privacy protections are being decided.
The main themes were:
The importance of communicating uncertainty as complexity increases
Innovation is inherently uncertain: Iterate, test and communicate
Privacy and publication are intertwined: Make a plan now!
Demonstration: 2010 Census Data Confidence Interval Tabulator
Mike gave a demonstration of the Cantabular US Census 2010 confidence interval tabulator, which allows users to generate tables on demand with the lower and upper confidence bounds associated with each row of data. The calculations are based on the published PPMF and the 25 replicates released by the US Census Bureau.
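To give a feel for how an interval can be derived from replicates, here is a minimal Python sketch. It is not the method used by Cantabular or the US Census Bureau, which this post does not describe. It assumes that the spread of the replicate counts around the published count approximates the noise in that count, and that a normal approximation is acceptable. The function name, the 90% default level and the clamping at zero are all illustrative assumptions.

```python
import math
from statistics import NormalDist


def replicate_confidence_interval(published, replicates, level=0.90):
    """Approximate a confidence interval for one table cell.

    Assumption (not from the talk): the mean squared deviation of the
    replicate counts from the published count stands in for the
    variance of the published count.
    """
    if not replicates:
        raise ValueError("at least one replicate count is required")
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")

    # Mean squared deviation of the replicates around the published count.
    mse = sum((r - published) ** 2 for r in replicates) / len(replicates)

    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * math.sqrt(mse)

    # Counts cannot be negative, so the lower bound is clamped at zero.
    low = max(0.0, published - margin)
    high = published + margin
    return low, high


# Hypothetical cell: a published count of 100 and 25 replicate counts.
example_replicates = [98, 102] * 12 + [100]
low, high = replicate_confidence_interval(100, example_replicates)
print(f"{low:.1f} to {high:.1f}")
```

Computing the bounds per cell in this way is cheap, which is one reason an on-demand tabulator can attach an interval to every row it returns.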
Try it! Build 2010 census cross-tabulations with confidence intervals
Explore the US Census 2020 with Cantabular
We also created a tabulator for the recently released 2020 census data. Users can select data through a friendly user interface or through a mapping feature that visualises census geography down to tract level. The system is designed to be secure and performant, and tables generated on demand are returned to users in less than a second.
Try it! Build 2020 census cross-tabulations and area profiles
Area Profiles
We added a feature called Area Profiles, which allows users to easily explore the population demographics of an area down to county level.