High performance data publication with real-time privacy protection
Build statistical outputs directly from microdata with automated privacy protections and integrated metadata to improve reproducibility, eliminate errors and enable innovative publication tools.
Statistical disclosure control at speed and scale
Unlock the value of your data by publishing it faster
Significantly reduce the delay between data collection and publication with fast, automated privacy protection and tabulation.
Increase productivity through automation and repeatability
Free up your statisticians’ time: automate tasks with our API and Python tools, and speed up testing of outputs and privacy methods.
Keep control of your own data and privacy approaches
Use our flexible configuration options and Disclosure Rules Language to fully control privacy techniques.
How it works
Manage your data and privacy methods your own way
Cantabular is cross-platform, self-hosted software: we give you our software, so you can keep control of your own data.
You can deploy Cantabular internally and use our productivity tools and APIs, or make it available on the web and connect it to your own publication systems and apps, or to ours.
Our novel Disclosure Rules Language can be used to create automated disclosure checks that fit the structure and content of your data.
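To illustrate the kind of automated check described above, here is a minimal Python sketch of an output table check. It is not written in the Disclosure Rules Language and does not reflect Cantabular's implementation; the function name, thresholds and rule logic are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    allowed: bool
    reason: str


def check_table(counts, max_cells=10_000, small_threshold=3,
                max_small_fraction=0.5):
    """Hypothetical output check: block very large or very sparse tables.

    All thresholds are illustrative assumptions, not recommended values.
    """
    counts = list(counts)

    # Rule 1: refuse tables with too many cells.
    if len(counts) > max_cells:
        return CheckResult(
            False, f"table has {len(counts)} cells; limit is {max_cells}")

    # Rule 2: refuse tables dominated by small, potentially disclosive counts.
    populated = [c for c in counts if c > 0]
    if not populated:
        return CheckResult(True, "table has no populated cells")
    small = sum(1 for c in populated if c < small_threshold)
    fraction = small / len(populated)
    if fraction > max_small_fraction:
        return CheckResult(
            False,
            f"{fraction:.0%} of populated cells are below {small_threshold}")

    return CheckResult(True, "ok")
```

With these assumed thresholds, `check_table([120, 85, 40, 2])` is allowed, while `check_table([1, 2, 1, 50])` is blocked because most populated cells are small counts.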
Protect your data and unlock its value
Cantabular provides an array of different data privacy techniques, implemented in a system designed to be secure at every level.
Use our configurable implementation of the cell-key method to add random noise to your tabular outputs, to round small counts, or to apply primary suppression.
Our Disclosure Rules Language allows you to implement custom query and output table checks to automatically block disclosive queries and tables.
Our software team can also work with you to implement any other privacy algorithms you need, and with your IT teams to deploy our software securely.
- Cell-key perturbation
- Zero-perturbation with structural zeros
- Primary suppression
- Automated query checks
- Customisable output table checks
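For readers unfamiliar with cell-key perturbation, the general idea can be sketched in Python as follows. This is a simplified illustration of the technique, not Cantabular's implementation: the key range, the seed, the hash-based record keys and the toy noise rule are all assumptions. In a production setting the noise would typically come from a configurable perturbation table.

```python
import hashlib

# Illustrative assumption: record keys and cell keys lie in [0, KEY_RANGE).
KEY_RANGE = 256


def record_key(record_id: str, seed: str = "demo-seed") -> int:
    """Derive a fixed pseudo-random key (0..255) for a record."""
    digest = hashlib.sha256(f"{seed}:{record_id}".encode("utf-8")).digest()
    return digest[0]


def cell_key(record_ids) -> int:
    """Combine the keys of all records contributing to a cell."""
    return sum(record_key(r) for r in record_ids) % KEY_RANGE


def perturbation(count: int, ckey: int) -> int:
    """Toy noise rule that depends only on the count and the cell key."""
    if count == 0:
        return 0  # in this sketch, empty cells are left at zero
    u = ckey / KEY_RANGE
    if u < 0.25:
        noise = -1
    elif u < 0.75:
        noise = 0
    else:
        noise = 1
    return max(noise, -count)  # never let a count drop below zero


def perturbed_count(record_ids) -> int:
    """Return the noisy count for the set of records in one cell."""
    ids = list(record_ids)
    return len(ids) + perturbation(len(ids), cell_key(ids))
```

Because the noise is derived from the records in the cell rather than drawn afresh on each request, the same group of records receives the same perturbed count in every table in which it appears.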
Get your data back in seconds, not minutes
Cantabular has been built to be fast. We use our own data format and query algorithms to deliver lightning-fast performance.
This speed opens new possibilities: real-time disclosure control in a flexible dissemination system, or pre-computation of millions of tables in a realistic timeframe.
Cantabular is designed to run behind an auto-scaling load balancer, so it can operate at scale as well as at speed and handle hundreds of requests per second.
Note: Query timings shown are indicative numbers for a single query of age by sex by low-level geography on 60 million rows of data. Cantabular timing includes disclosure control methods; database timings are optimistic because they only measure time to make tables without disclosure control applied.
Integrate Cantabular into your production chain for consistent, repeatable results
Cantabular is designed to operate as part of a reproducible analytical pipeline to allow reliable analysis of data, free from copy-paste errors.
Our small, secure and fast API can be connected to your own systems and pipelines to allow automated verification and production of outputs.
An open-source Python API wrapper and our OpenAPI specification let your data analysts work with our API from code editors or interactive notebooks such as Jupyter, speeding up analysis.
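As an example of automated verification in a pipeline, the Python sketch below fingerprints a table so that a later run can confirm the output has not changed. It does not call the Cantabular API or its Python wrapper; the table layout (a dictionary with `dimensions` and `values`) and both function names are assumptions made for illustration.

```python
import hashlib
import json


def table_fingerprint(table: dict) -> str:
    """Return a stable SHA-256 fingerprint of a table.

    Keys are sorted before hashing so that dictionary ordering
    does not affect the result.
    """
    canonical = json.dumps(table, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_output(table: dict, expected_fingerprint: str) -> None:
    """Raise ValueError if the table differs from the approved output."""
    actual = table_fingerprint(table)
    if actual != expected_fingerprint:
        raise ValueError(
            f"table changed: expected {expected_fingerprint[:12]}, "
            f"got {actual[:12]}")


# Hypothetical table, as a pipeline step might hold it in memory.
approved = {"dimensions": ["age", "sex"], "values": [10, 12, 9, 11]}
approved_fingerprint = table_fingerprint(approved)

# A later run that produces the same table passes silently.
verify_output({"values": [10, 12, 9, 11], "dimensions": ["age", "sex"]},
              approved_fingerprint)
```

Storing the fingerprint alongside the pipeline configuration gives a cheap regression test: any change to the data, the query or the privacy settings that alters the output is flagged before publication.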
The Office for National Statistics in the UK has selected Cantabular and its real-time privacy protection capability to help disseminate Census 2021 data.
We operate a fair pricing model that depends on the volume of data processed per annum, together with a minimum licence fee. There is no charge for multiple servers or for user licences.
Please get in touch if you would like to discuss this further with us.
Request a demo
Enter your email address below and we’ll get in touch to arrange a chat.
You can also check out our live demo using the 1911 Irish census.