Fictional Footnotes

Every so often, a story comes along that perfectly captures the quiet chaos of our AI age. Not a sci-fi meltdown or a rogue robot, but something far more mundane: a report, an email, a document that looks good enough to send but is completely wrong.

This week, that story came from Australia. Deloitte Australia agreed to refund part of the fee for a £215,000 government report after it was found to contain AI-generated fabrications: made-up academic papers, false citations, and even a fictional quote from a federal judge.

The irony of a consultancy trusted to review how automation was being used in welfare systems ending up with an erroneous report because it used automation. Brilliant.

We live in a time when credibility can be manufactured in seconds. AI can generate a page of references, citations, and “expert sources” that look impeccable: formatted perfectly and filled with the right keywords and institutional names. And if teams are passing off this work as their own without understanding the actual content, they run the risk of sending out nonsense that doesn’t stand up to scrutiny.

When I first started experimenting with large language models, I noticed early on that they’ll happily invent things if you don’t check their work. They’re trained to sound right, not be right. They also tended, and to a point still do, to agree with whatever your opinion is, regardless of your technical level.

And that’s the issue, really. It’s standing on the shoulders of giants, but if you’re not in command of the basic facts you’ve asked AI to “build on”, there’s no way to review, edit, or critique what comes out.

The temptation, when you hear about AI “hallucinations,” is to blame the model. But in this case, as in all cases, the accountability doesn’t sit with the machine. It sits with the human who used it, and the organisation that approved it.

AI can draft, summarise, and even find patterns we’d miss. But it can’t take responsibility. It can’t say, “I checked that case law myself.”

That’s our job.

This isn’t really an AI failure, it’s a human process failure. Deloitte’s team used Azure OpenAI to assist in writing the report. Fair enough. But at some point, the sense-checks, fact-checks, and human oversight failed.

What’s striking here isn’t the presence of AI. It’s the absence of due diligence.

When you or I use ChatGPT or Claude or Gemini, we (hopefully) treat their outputs as starting points: the whole “treat it like an intern” philosophy. They give us drafts to be improved, ideas to be tested, perspectives that challenge our own thinking.

But when you remove that sceptical step, when you take the output at face value, you risk dressing up fiction as fact.

And when that happens in an official report to government, it’s not a “hallucination.” It’s malpractice.

I get it, though. I really do. Once you’ve worked with AI tools for a while, the convenience can be intoxicating. Reports that used to take days can be done in hours. Drafts that once took a week now take an afternoon.

It’s easy to see why organisations, even ones with Deloitte’s resources, fall into the trap. When everything looks polished, it feels correct.

The AI never gets tired. It never misses a deadline. It doesn’t ask for clarification or second opinions. It just produces.

But therein lies the danger.

The ease of production has made it harder to slow down. To pause. To sense-check.

We’ve become so used to the fluency of machines that we risk losing the very instinct that keeps us from blindly trusting them: critical thinking.

I have a personal motto, or mantra: “do hard things”. The idea is to build resilience by doing hard things until what used to be “hard” becomes normal. It usually applies to fitness or endurance, but it fits here too. The mental equivalent of that exercise is the slow, uncomfortable work of thinking for yourself: cross-checking, questioning, and verifying.

In a world where AI can spin a thousand words faster than you can boil the kettle, the new form of discipline isn’t speed; it’s restraint.

That’s what went missing in this Deloitte case: the discipline to stop and ask, “Has someone signed this off? Does this actually make sense?”

The irony is that the report was about automation itself, how welfare systems use algorithms to issue penalties. A report critiquing automation was undone by its own over-reliance on automation.

Deloitte did review the document after the errors were pointed out, corrected the issues, refunded the final payment, and republished a revised version with a disclosure about AI use.

But that doesn’t erase the underlying issue.

We already struggle to tell what’s real. When one of the world’s biggest consultancies can submit a fabricated quote from a judge to a government department, it chips away at the bedrock of professional trust.

If the “experts” can’t be relied upon to verify what they deliver, and if those same “experts” are using AI tools without understanding their limitations, how does anyone else stand a chance?

That’s why sense-checking isn’t optional, and never was. It’s something anyone using GenAI to produce information must have baked in from the start.
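To make “baked in” concrete, here is one way a first-pass reference check could be automated: a minimal sketch in Python that assumes Crossref’s public REST API and a hypothetical citation list, not a description of what Deloitte or anyone else actually runs.

```python
# A rough sketch of a first-pass citation check: look each cited title up in
# Crossref and flag anything with no plausible match for a human to chase.
# The endpoint and response fields follow Crossref's public REST API as I
# understand it; the citation list below is hypothetical.
import requests
from difflib import SequenceMatcher


def closest_crossref_match(cited_title: str) -> tuple[str, float]:
    """Return the best-matching published title and a 0-1 similarity score."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": cited_title, "rows": 3},
        timeout=10,
    )
    resp.raise_for_status()
    best_title, best_score = "", 0.0
    for item in resp.json()["message"]["items"]:
        for title in item.get("title", []):
            score = SequenceMatcher(None, cited_title.lower(), title.lower()).ratio()
            if score > best_score:
                best_title, best_score = title, score
    return best_title, best_score


# Hypothetical citations pulled from a draft report.
citations = [
    "Algorithmic decision-making and welfare compliance: a review",
]

for cited in citations:
    match, score = closest_crossref_match(cited)
    verdict = "looks real" if score > 0.8 else "CHECK BY HAND"
    print(f"{verdict}: '{cited}' -> closest match: '{match}' ({score:.2f})")
```

Even then, a lookup like this only flags references that don’t seem to exist. It won’t catch a real paper misquoted, or an invented quote attributed to a real judge; that still takes a human who has read the source.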

I spend a lot of time using AI tools, for work, for writing, for creative projects. But I also spend just as much time testing them. Comparing outputs. Cross-referencing facts. Rewriting things by hand to see if they still make sense without the gloss.

That friction is where the value lies.

You don’t become good at using AI by trusting it. You become good by challenging it, by learning where it’s weak, where it bluffs, and where it shines.

In other words: by staying in the loop.

The best AI users aren’t the ones who get perfect answers. They’re the ones who catch the subtle errors before anyone else notices them.

In hindsight, the Deloitte story will likely fade into another headline in the long list of AI missteps. But it shouldn’t.

It’s a cautionary tale of what happens when competence gives way to convenience. The future won’t be defined by who uses AI, but by who uses it well.

Because the real danger isn’t AI making mistakes.

It’s humans forgetting how to notice them.

Not the 9 O'Clock News. Building an aggregator you can trust
