Government consultations are one of those things most people ignore until the topic suddenly matters to them. A local road change, new transport rules, driving test booking reform, rail access, bus services, clean air plans, pavement rules, parking policy, airport expansion: the list goes on. When people do respond, they usually want one basic reassurance: that a real person will properly hear what they said. That is why a new Department for Transport story about using Google Cloud AI to analyse consultation responses is interesting, and a little sensitive too.
The Department for Transport, or DfT, says it now uses a Consultation Analysis Tool, known as CAT, to help sort and analyse huge volumes of free-text feedback. According to Google’s account, the department runs about 55 public consultations a year, some generating more than 100,000 written responses. The new system, built with the Alan Turing Institute and using Google’s Gemini models on Vertex AI, is meant to identify themes in those responses in hours rather than months. DfT says the approach could save up to £4 million a year.
On its own, that could sound like yet another vague promise that AI will make paperwork disappear. But this case is more concrete than most. DfT has already published a technical evaluation on GOV.UK, and one consultation response on driving test booking rules explicitly says free-text comments were analysed using the CAT. In that consultation, officials said more than 102,000 responses were received, 8,803 automated or bot responses were removed, and 93,421 responses were then analysed. The last two figures add up to 102,224, which squares with the stated total. That matters because it shows the tool is not just a pilot on a slide deck. It is already shaping how a live public consultation is processed.
For ordinary UK readers, the question is not whether civil servants enjoy theme mapping. It is whether an AI-assisted process can still treat public feedback fairly. If you take time to write about a transport problem in your area, you do not want your point flattened into a generic label, grouped with the wrong concern, or quietly buried because the software found the wrong pattern. DfT seems aware of that risk. Its published evaluation stresses that the tool works with “robust human oversight”, and Google says the department uses a human-in-the-loop model so policy teams can review outputs for accuracy, fairness and bias before decisions are made.
That human oversight point is the whole story. If AI is being used as a sorting and summarising assistant, with people still checking themes, challenging weak output and making the final call, many readers will probably see that as a practical use of the technology. Governments deal with large volumes of repetitive material, and there is a real public interest in getting consultation summaries out faster. DfT says departments are expected to publish consultation responses within 12 weeks, and anyone who has waited months for an official response will understand why speed matters.
But the same story becomes less comfortable if “AI helps us cope with scale” drifts into “AI more or less decides what the public meant”. That is where trust can break. Large language models are good at spotting patterns and producing neat summaries, but neat is not always the same as faithful. We have seen in other settings that AI can sound confident while missing nuance, softening disagreement or over-compressing messy human views. That is one reason we keep coming back to source-checking and verification, including in our earlier piece on why trustworthy source habits still matter when AI offers convenient summaries.
There is also a democratic angle here. Consultations are not customer service tickets. They are one of the ways the public, campaign groups, experts and affected communities try to influence policy before decisions are finalised. If AI is now part of that pipeline, government departments will need to explain clearly what the systems do, what they do not do, how outputs are checked, and what protections exist against bias or over-simplification. To DfT’s credit, publishing a technical evaluation is a better start than simply declaring the tool accurate and moving on.
So what should you do differently as a reader or respondent? Probably not much, but a few habits become more important. First, be specific. Clear examples, concrete concerns and plain wording are easier for both humans and machines to interpret accurately. Second, do not assume every nuance of your argument will survive summarisation, so make your strongest point unmistakable. Third, when a consultation outcome is published, skim the summary and the next steps to see whether the department’s account broadly reflects the concerns people raised. If something looks oddly compressed or missing, that is useful to notice and challenge.
The broader lesson is that AI in government will often arrive quietly, inside admin-heavy processes rather than flashy public launches. That can be good if it reduces delays and helps officials deal with massive workloads. It can also be risky if the public is expected to accept automation without enough explanation. In this case, the sensible response is neither panic nor blind trust. It is to expect transparency, meaningful human review and a clear explanation of how your words are being handled. If government wants people to keep taking consultations seriously, that reassurance is not optional. It is part of the job.
