Why AI in healthcare needs a safety net

A safety net for clinical documentation: accurately validating AI output

Ambient voice technology is transforming NHS documentation. Instead of spending hours typing up notes after patient consultations, clinicians can now speak naturally and have AI convert their words into structured medical documents. It’s a game-changer for reducing administrative burden, but there’s a catch that few people talk about.

A recent BMJ survey revealed that one in five GPs are already using generative AI tools like ChatGPT in their clinical practice, despite a lack of formal guidance or clear work policies. Of those using AI, 29% employ it to generate documentation after patient appointments and 28% use it to suggest potential diagnoses.

“When we started exploring AI for clinical documentation in 2022, we discovered something fascinating,” explains Dr Andrew Whiteley, Managing Director of Lexacom. “Large language models are incredibly good at being helpful, sometimes too helpful. They’ve been trained to recognise patterns and provide useful summaries, which is brilliant for most applications. But in medicine, there’s a fundamental difference between recording what a patient says and interpreting what it means.”

The helpful AI problem

Think about how you might describe a drink to someone. If you said “juniper-based spirit with carbonated quinine water”, most people would helpfully summarise that as “a gin and tonic”. Large language models do the same thing: they’ve learned that this is the ‘helpful’ response, recognising patterns and generating the most probable reply.

In everyday contexts, this is fantastic. In medical documentation, it becomes problematic. When a patient describes symptoms like chest discomfort and shortness of breath, the AI might helpfully write “patient presents with cardiac symptoms” in the notes. It’s made a reasonable connection, but it’s crossed an important line: only qualified clinicians should interpret symptoms and suggest diagnoses.

This isn’t the AI making things up or ‘hallucinating’ in the traditional sense. It’s doing what it was designed to do: being helpful by recognising patterns. The challenge is that medical records need to be precise accounts of what was said, not helpful interpretations.

But there’s another, equally serious problem: what AI leaves out. “A model might brilliantly summarise a 30-minute consultation into a concise letter, but if it’s dropped the patient’s penicillin allergy or their history of adverse reactions, that ‘helpful’ summary becomes dangerous,” notes Dr Whiteley. “The AI thinks it’s being helpful by condensing information, but in medicine, what’s left unsaid can be as dangerous as what’s wrongly added.”

CertifAI®, Lexacom’s certification technology for AI-generated medical documentation

Why simple solutions don’t work

If you have used popular AI chatbots such as ChatGPT or Google Gemini, you might think the answer is straightforward: just tell the AI not to add anything or make interpretations. Unfortunately, it’s not that simple. These behaviours happen at such a fundamental level in how AI processes language that instructions like “don’t hallucinate” or “stick to the facts” simply don’t work. It’s rather like telling someone not to recognise a pattern they automatically see; the recognition happens before conscious thought.

“We spent months testing different approaches,” says Dr Whiteley. “Every major large language model we tested, regardless of how we instructed it, would still make these helpful inferences when converting consultations into clinical documents. That’s when we realised we needed a completely different approach: not trying to stop AI being helpful, but adding a certification layer that could validate when outputs remained faithful to the original.”

CertifAI® from Lexacom validates AI outputs with 99.99% accuracy

This insight led to the development of CertifAI®, Lexacom’s certification technology for AI-generated medical documentation. Rather than trying to prevent AI from making inferences, which proves nearly impossible, CertifAI checks the AI’s work afterwards, comparing the original consultation transcript against the generated document, to spot where interpretation may have crept in.

In testing, CertifAI achieved 99.99% accuracy in detecting when AI had added diagnostic conclusions that weren’t explicitly stated in the original consultation. Just as importantly, it identifies when critical information has gone missing, from dropped symptoms to omitted safety instructions. Powered by Lexacom’s Comprehension Engine® with its deep medical understanding, the CertifAI layer understands the difference between acceptable medical notation and dangerous changes, minimising false positives. For instance, it recognises that abbreviating ‘glyceryl trinitrate spray’ to ‘GTN spray’ is standard clinical practice, not a hallucination or omission error.
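To make the general idea concrete, here is a deliberately simplified sketch of what a post-hoc fidelity check can look like: compare the transcript with the generated summary, flag diagnostic language that has no basis in the transcript, flag safety-critical facts that were dropped, and expand accepted abbreviations first so shorthand like ‘GTN spray’ doesn’t trigger a false positive. The term lists, synonym map, and string matching below are invented for illustration only and bear no relation to how CertifAI or the Comprehension Engine actually work.

```python
# Toy post-hoc fidelity check for an AI-generated clinical summary.
# All term lists and the matching approach are hypothetical examples.

# Diagnostic language that should not appear unless explicitly stated.
DIAGNOSTIC_TERMS = {"cardiac", "angina", "myocardial infarction"}

# Safety-critical facts that must survive summarisation.
CRITICAL_TERMS = {"penicillin allergy", "adverse reaction"}

# Accepted clinical abbreviations, expanded before comparison so that
# standard shorthand is not mistaken for an alteration.
SYNONYMS = {"gtn spray": "glyceryl trinitrate spray"}

def normalise(text: str) -> str:
    text = text.lower()
    for abbrev, full in SYNONYMS.items():
        text = text.replace(abbrev, full)
    return text

def check_fidelity(transcript: str, summary: str) -> dict:
    t, s = normalise(transcript), normalise(summary)
    # Additions: diagnostic conclusions present in the summary but absent
    # from what was actually said.
    added = sorted(term for term in DIAGNOSTIC_TERMS if term in s and term not in t)
    # Omissions: safety-critical information dropped by the summary.
    omitted = sorted(term for term in CRITICAL_TERMS if term in t and term not in s)
    return {"added": added, "omitted": omitted, "certified": not added and not omitted}

transcript = ("Patient reports chest discomfort and shortness of breath. "
              "Penicillin allergy noted. Advised to use glyceryl trinitrate spray.")
summary = "Patient presents with cardiac symptoms. Advised to use GTN spray."

result = check_fidelity(transcript, summary)
# 'cardiac' was inferred and the penicillin allergy was dropped, so this
# summary would be flagged for clinician review rather than certified.
```

A production system would need far more than keyword matching, e.g. clinical terminology mapping and semantic comparison, but the shape is the same: certify the output against the source rather than trying to stop the model making inferences in the first place.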

Moving forward with confidence

The NHS needs AI to help solve its documentation crisis. Clinicians are drowning in paperwork, and ambient voice technology offers genuine relief. But as with any powerful tool in healthcare, appropriate safeguards are essential. The BMJ researchers concluded that “doctors and medical trainees need to be fully informed about the pros and cons of AI, especially because of the inherent risks of inaccuracies (‘hallucinations’), algorithmic biases, and the potential to compromise patient privacy”.

The market is suddenly flooded with ambient voice technology suppliers, many launching with no prior healthcare experience, some that didn’t even exist two years ago, all promising to revolutionise clinical documentation. But healthcare isn’t just another vertical to disrupt; it requires deep understanding of clinical workflows, patient safety, and the unique demands of medical documentation.

“AI will revolutionise clinical documentation – but it needs to be done safely”


– Dr Andrew Whiteley

“We’re not anti-AI, quite the opposite,” emphasises Dr Whiteley. “We’ve been implementing transcription workflow software, including speech recognition, in healthcare for over 25 years. We believe AI will revolutionise clinical documentation. But we also believe it needs to be done safely, with proper verification that understands the unique requirements of medical records.”

The good news is that with the right verification approach, NHS organisations can confidently deploy AI documentation tools. Clinicians can speak naturally, knowing the resulting documents will be both efficient and accurate. The technology preserves what makes AI valuable, its ability to structure and format information quickly, whilst ensuring medical accuracy and completeness.

As ambient voice technology becomes standard across the NHS, the conversation needs to shift from whether to use AI to how to use it safely. With proper verification in place, we can finally deliver on AI’s promise: giving clinicians their time back to focus on what matters most, their patients.
