Real-time privacy protection for confidential datasets

Automate the privacy protection and production of tabular data with our powerful API, Python tools and user interfaces, ensuring repeatable outputs and enabling flexible dissemination.

Why Cantabular?

Statistical disclosure control at speed and scale

Unlock the value of your data by publishing it faster

Significantly reduce the delay between data collection and publication with fast, automated privacy protection and tabulation.

Increase productivity through automation and repeatability

Free up your statisticians’ time by automating tasks with our API and Python tools and speeding up testing of outputs and privacy methods.

Keep control of your own data and privacy approaches

Use our flexible configuration options and Disclosure Rules Language to fully control privacy techniques.

How it works

Flexibility and control

Manage your data and privacy methods your own way

Cantabular is cross-platform, self-hosted software: we give you our software, so you can keep control of your own data.

You can deploy Cantabular internally and use our productivity tools and APIs, or make it available on the web and connect it to your own publication systems and apps, or to ours.

Our novel Disclosure Rules Language can be used to create automated disclosure checks that fit the structure and content of your data.

A screenshot of the administration UI with an example dataset loaded

Privacy and security

Protect your data and unlock its value

Cantabular provides an array of different data privacy techniques, implemented in a system designed to be secure at every level.

Use our configurable implementation of the cell-key method to add random noise to your tabular outputs, to round small counts or to apply primary suppression.
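As a rough illustration of the general cell-key idea, here is a minimal Python sketch. It is not Cantabular's implementation: the key range, the toy noise lookup and the function names (`assign_record_keys`, `perturbation`, `tabulate`) are all assumptions made for this example. Each record carries a fixed random key, each table cell derives its own key from the records it contains, and that cell key selects the noise, so the same group of records always receives the same perturbation.

```python
import random
from collections import defaultdict

KEY_RANGE = 256  # size of the record-key space (an assumed value)


def assign_record_keys(n_records, seed=0):
    """Give every record a fixed pseudo-random key in [0, KEY_RANGE)."""
    rng = random.Random(seed)
    return [rng.randrange(KEY_RANGE) for _ in range(n_records)]


def perturbation(count, cell_key):
    """Toy lookup from (count, cell key) to a noise value of -1, 0 or +1.

    Zero counts are left untouched. A production system would use a
    carefully designed perturbation table instead of this simple banding.
    """
    if count == 0:
        return 0
    band = cell_key * 4 // KEY_RANGE  # split the key space into 4 bands
    noise = (-1, 0, 0, 1)[band]
    return max(noise, -count)  # never push a count below zero


def tabulate(records, record_keys, by):
    """Cross-tabulate records by the variables in `by`, with perturbation."""
    counts = defaultdict(int)
    cell_keys = defaultdict(int)
    for record, key in zip(records, record_keys):
        cell = tuple(record[variable] for variable in by)
        counts[cell] += 1
        # The cell key is the sum of its record keys, modulo the key range.
        cell_keys[cell] = (cell_keys[cell] + key) % KEY_RANGE
    return {
        cell: counts[cell] + perturbation(counts[cell], cell_keys[cell])
        for cell in counts
    }


if __name__ == "__main__":
    people = [
        {"sex": sex, "age": age}
        for sex in ("F", "M")
        for age in ("0-15", "16-64", "65+")
        for _ in range(5)
    ]
    keys = assign_record_keys(len(people), seed=42)
    print(tabulate(people, keys, by=("sex", "age")))
```

Because the noise depends only on the records in a cell, re-running the same query returns the same perturbed counts, which is what makes outputs repeatable.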

Our Disclosure Rules Language allows you to implement custom query and output table checks to automatically block disclosive queries and tables.
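To give a flavour of what query and output table checks can look like, here is a short sketch in plain Python. It does not use the Disclosure Rules Language, whose syntax is not shown here; the function names and every threshold below are hypothetical choices for illustration only.

```python
def check_query(variables, max_variables=4):
    """Query check: reject requests that cross too many variables.

    The limit of 4 is an assumed value, not a recommended setting.
    """
    if len(variables) > max_variables:
        return False, f"too many variables ({len(variables)} > {max_variables})"
    return True, "ok"


def check_table(counts, small=3, max_small_share=0.5):
    """Output check: reject tables dominated by small non-zero counts.

    `counts` is a flat list of cell counts. A cell is "small" when its
    count is above zero but below `small` (both thresholds are assumed).
    """
    cells = list(counts)
    if not cells:
        return False, "empty table"
    small_cells = sum(1 for count in cells if 0 < count < small)
    share = small_cells / len(cells)
    if share > max_small_share:
        return False, f"{small_cells} of {len(cells)} cells are small"
    return True, "ok"


if __name__ == "__main__":
    print(check_query(["age", "sex", "geography"]))
    print(check_table([10, 12, 1, 2, 1]))
```

A query check of this kind runs before any table is built, while an output check inspects the finished table, so a request can be blocked at either stage.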

Our software team can also work with you to implement any other privacy algorithms you need, and with your IT teams to deploy our software securely.

  • Cell-key perturbation
  • Zero-perturbation with structural zeros
  • Rounding
  • Primary suppression
  • Automated query checks
  • Customisable output table checks

Speed and performance

Get your data back in seconds, not minutes

Cantabular has been built to be fast. We use our own data format and query algorithms to ensure lightning-fast performance.

This speed opens new possibilities: real-time disclosure control in a flexible dissemination system, or pre-computation of millions of tables in a realistic timeframe.

Cantabular is designed to run behind an auto-scaling load balancer, so it can operate at scale as well as at speed and handle hundreds of requests per second.

Note: Query timings shown are indicative numbers for a single query of age by sex by low-level geography on 60 million rows of data. Cantabular timing includes disclosure control methods; database timings are optimistic because they only measure time to make tables without disclosure control applied.

  • Cantabular: 1.5 seconds
  • Postgres: ~1 minute
  • SQLite: ~5 minutes
  • SAS: ~10 minutes

Consistent and repeatable

Integrate Cantabular into your production chain for consistent, repeatable results

Cantabular is designed to operate as part of a reproducible analytical pipeline to allow reliable analysis of data, free from copy-paste errors.

Our small, secure and fast API can be connected to your own systems and pipelines to allow automated verification and production of outputs.

Our open-source Python API wrapper and our OpenAPI specification let your data analysts work easily with our API, speeding up analysis in code editors or in interactive notebooks like Jupyter.

The Office for National Statistics in the UK has selected Cantabular and its real-time privacy protection capability to help disseminate Census 2021 data.

Pricing

We operate a fair pricing model based on the volume of data processed per annum, together with a minimum licence fee. There is no charge for multiple servers or for user licences.

Please get in touch if you would like to discuss this further with us.

Request a demo

Enter your email address below and we’ll get in touch to arrange a chat.

You can also check out our live demo using the 1911 Irish census.

Get in touch to find out more