
Integrating AI into Official Statistics
Official statistics are underpinned by three core pillars: trustworthiness, quality, and value, as outlined in the UK Code of Practice for Statistics. Produced independently using robust methodologies and reliable data, these statistics exist to serve the public good.
While headline figures from flagship surveys like the Census often capture media attention, much of the data remains embedded in complex spreadsheets and technical reports—accessible primarily to those with statistical expertise.
The Promise and Perils of LLMs
Recently, large language models (LLMs) such as ChatGPT have demonstrated significant potential in making statistical data more accessible. By responding to natural-language prompts, these models can extract relevant statistics and provide conversational insights, enhancing the value of official statistics.
However, LLMs carry none of the built-in trustworthiness and quality assurance that official statistical producers uphold. They are prone to hallucinations, may overlook methodological nuances, and can present figures without appropriate context.
Embracing AI: A Strategic Imperative
Despite these challenges, LLMs are becoming integral to how individuals access information. Their ease of use and ability to provide quick insights mean that many users prefer querying a chatbot over navigating complex datasets.
This shift presents both a challenge and an opportunity for National Statistical Institutes (NSIs). Rather than dismissing AI outputs as unreliable, NSIs must engage proactively with these technologies to ensure that authoritative statistics remain visible, trusted, and usable in an AI-driven landscape.
Global Initiatives and Collaborations
International organizations are already taking steps to integrate AI into official statistics:
The UNECE encourages NSIs to explore LLMs through small, experimental projects, recognizing their potential while emphasizing the need for caution and evaluation. (UNECE LLM paper)
The European Statistical System (ESS) has launched the Artificial Intelligence and Machine Learning for Official Statistics (AIML4OS) initiative, aiming to develop tools and guidance for integrating AI into statistical processes.
The World Bank advocates for making official statistics AI-ready, emphasizing consistent formats and metadata to enable seamless interaction with AI tools. (World Bank on AI-ready stats)
Cantabular’s Approach: Aligning Innovation with Integrity
At Cantabular, our prototype Table Assist exemplifies a responsible and practical use of LLMs in official statistics. By interpreting natural-language queries and generating precise data requests without allowing the LLM to alter the underlying statistics, we ensure transparency, accuracy, and user control. Users can refine their queries iteratively, and every result includes a direct link to the original query, reinforcing full traceability. This approach harnesses the power of AI to improve accessibility while preserving the integrity and trustworthiness that official statistics demand.
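The pattern described above can be sketched in a few lines: the LLM proposes a structured data request, and the application validates it against the dataset's known variables before anything runs, so the model never touches the statistics themselves. This is an illustrative sketch only; the dataset name, variable codes, and request shape are hypothetical, not Cantabular's actual API.

```python
# Hypothetical dataset and variable codes, for illustration only.
KNOWN_VARIABLES = {"region", "age_band", "occupation", "economic_activity"}

def validate_request(request: dict) -> list[str]:
    """Check an LLM-proposed table request; an empty error list means it is safe to run."""
    errors = []
    if request.get("dataset") != "census2021":
        errors.append(f"unknown dataset: {request.get('dataset')!r}")
    variables = request.get("variables", [])
    if not variables:
        errors.append("request must name at least one variable")
    for var in variables:
        if var not in KNOWN_VARIABLES:
            errors.append(f"unknown variable: {var!r}")
    return errors

# As an LLM might translate "employment by region":
proposed = {"dataset": "census2021", "variables": ["region", "economic_activity"]}
assert validate_request(proposed) == []  # safe to pass to the tabulation engine

# A hallucinated variable name is caught before any data is fetched:
bad = {"dataset": "census2021", "variables": ["regoin"]}
assert validate_request(bad) == ["unknown variable: 'regoin'"]
```

Because the LLM's output is only ever a request, not an answer, every figure the user sees still comes directly from the statistical engine.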
Watch a short screencast demonstrating how Table Assist lets you quickly and easily find the data you need using natural language within the Cantabular UI.
Importantly, this philosophy aligns closely with the broader direction NSIs and other statistical organizations are pursuing globally: leveraging LLMs as intelligent assistants and query builders within structured, human-in-the-loop frameworks. Such strategies strike a balance between innovation and oversight, ensuring AI enhances official statistics without compromising quality or trust.

Screenshot: Using Table Assist to get data from Cantabular
Human-in-the-Loop: Balancing Efficiency and Oversight
Recognizing the risks of fully autonomous AI outputs, several organizations have developed human-in-the-loop tools: systems in which LLMs generate initial drafts or insights that experts then review and validate.
Notable examples include:
The Bank for International Settlements (BIS) has developed a metadata editor powered by LLMs to assist in drafting statistical metadata. (BIS metadata editor)
Statistics Canada created a tool that automatically generates reports from statistical tables. (Statistics Canada LLM paper)
The Deutsche Bundesbank launched Text-based Intelligent Assistants (TIAs), allowing staff to summarise documents, generate content from bullet points, and more within a secure interface. (Bundesbank TIAs)
Experimenting with RAG: Chatbots for Statistical Content
Some organizations have tested retrieval-augmented generation (RAG) systems to make official statistics more user-friendly. These tools let users query content from documents such as PDFs on government websites.
The ONS developed StatsChat. (ONS StatsChat concerns)
Statistics Canada developed IntelliStatCan.
Both prototypes used classic RAG architectures, retrieving data from statistical documents to generate responses. However, they were only available for a limited time. The ONS paused StatsChat due to concerns it might return irrelevant, offensive, or politically sensitive content—underscoring the reputational risks of deploying LLMs in public-facing contexts.
These early trials highlight both the potential and challenges of integrating LLMs into public statistics, particularly the need to balance accessibility with accuracy and institutional values.
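As a rough illustration of the classic RAG architecture these prototypes used, the sketch below retrieves the passages most relevant to a question and builds a prompt instructing the model to answer only from them. Real systems rank with vector embeddings rather than this naive word-overlap score, and the example documents are invented.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query; return the top k."""
    q_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

# Invented stand-ins for published statistical content:
docs = [
    "The unemployment rate was 4.2 percent in the latest quarter.",
    "Census methodology report: imputation and disclosure control.",
    "Average household size fell between 2011 and 2021.",
]

context = retrieve("what is the unemployment rate", docs, k=1)
assert context[0].startswith("The unemployment rate")

# The retrieved passages are injected into the prompt, grounding the answer
# in published content rather than the model's parametric memory:
prompt = ("Answer only from the context below.\n\n"
          + "\n".join(context)
          + "\n\nQuestion: what is the unemployment rate?")
```

Even with this grounding step, the generation stage remains free-form, which is exactly where the reputational risks the ONS identified can creep in.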
From Natural Language to Reliable Data: LLMs as Query Builders
A promising approach avoids freeform generation and instead uses LLMs to help users access existing data sources via structured queries.
Examples include:
StatBot.Swiss – Developed by Zurich University of Applied Sciences, the Swiss Data Science Center, and the Federal Statistical Office, this tool enables bilingual querying of open data by translating natural language into SQL. (StatBot.Swiss paper)
StatGPT – A collaboration between EPAM Systems and the International Monetary Fund (IMF), this tool interprets user questions and converts them into structured queries against an SDMX-compliant API. (StatGPT project)
DataGemma – Developed by Google, this tool integrates Gemini outputs with Google Data Commons to reduce hallucinations and enhance factual accuracy. (Google DataGemma blog)
These applications demonstrate how LLMs can act as intelligent intermediaries—translating human questions into structured, verifiable outputs—offering a responsible path forward in the AI-statistics space.
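A minimal version of this query-builder pattern might look like the sketch below, assuming a SQLite table with made-up population figures: the LLM's job ends at producing SQL text, and the application refuses anything that is not a single read-only SELECT before touching the database.

```python
import sqlite3

def run_llm_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute an LLM-proposed query only if it is a single read-only SELECT."""
    statement = sql.strip().rstrip(";")
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(statement).fetchall()

# Table and figures are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (canton TEXT, year INTEGER, people INTEGER)")
conn.executemany("INSERT INTO population VALUES (?, ?, ?)",
                 [("Zurich", 2023, 1580000), ("Geneva", 2023, 520000)])

# As an LLM might translate "How many people live in Zurich?":
rows = run_llm_sql(conn, "SELECT people FROM population WHERE canton = 'Zurich'")
assert rows == [(1580000,)]

# A destructive statement never reaches the database:
try:
    run_llm_sql(conn, "DROP TABLE population")
except ValueError:
    pass  # rejected by the guard, as intended
```

Production tools such as StatBot.Swiss add proper SQL parsing, schema-aware prompting, and result validation on top of this basic guard, but the division of labour is the same: the model proposes, the system verifies and executes.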
Conclusion
The integration of large language models into the world of official statistics presents both exciting opportunities and significant challenges. As this technology reshapes how people access and interact with data, National Statistical Institutes and other producers must adapt proactively—embracing AI to enhance accessibility while safeguarding the core pillars of trustworthiness, quality, and value.
At Cantabular, our Table Assist prototype exemplifies a balanced and practical path forward: leveraging LLMs to improve usability and user experience without compromising the accuracy or integrity of the statistics themselves. This approach aligns with global efforts to harness AI as a supportive tool—augmenting human expertise through transparent, auditable processes that maintain the high standards the public expects.
Ultimately, the future of official statistics lies not in resisting AI, but in shaping its application responsibly and collaboratively. By doing so, statistical organizations can ensure that authoritative data remains a trusted cornerstone of public knowledge in an increasingly AI-driven world.