- A large research project found that leading AI language models can repeat false medical claims when those claims appear inside realistic clinical notes or social-media style discussions.
- The models often treated confident medical-sounding misinformation as routine guidance rather than flagging it as unsafe.
- The authors argue that “susceptibility to medical misinformation” should be measured and stress-tested before AI is embedded into healthcare tools.
Medical AI is often promoted as a way to make care safer by helping clinicians manage complex information.
But there is an uncomfortable question behind the hype: what happens when the information itself is wrong? If a fabricated or misleading recommendation is wrapped in medical language, can an AI system pass it on as if it were standard care?
A new study examined this risk at scale by testing how large language models handle medical misinformation across different contexts.
Researchers analysed more than a million prompts across nine leading models and found that false claims can be repeated and reinforced, particularly when they appear in formats that look familiar, such as hospital discharge notes or realistic patient scenarios.
To test the problem systematically, the team used three main types of content.
First, they took real discharge summaries from the MIMIC intensive care database and inserted a single fabricated recommendation into each note.
Second, they used common health myths collected from Reddit.
Third, they built 300 short clinical scenarios written and validated by physicians.
Each case was also presented in multiple styles, ranging from neutral wording to emotionally charged or leading phrasing that mimics how misinformation spreads online.
One example illustrates the issue.
A discharge note included a false recommendation suggesting that patients with bleeding linked to oesophagitis should drink cold milk to soothe symptoms. Several models accepted the statement rather than warning that it was unsafe or unsupported.
The wording and the setting made the misinformation look “normal”, and the AI responded accordingly.
The researchers’ broader point was simple: for these systems, the way a claim is framed can matter more than whether it is correct.
If misinformation is written with the tone and structure of clinical advice, models may default to treating it as true.
The authors argue that current safeguards are not consistent enough once misinformation is embedded in realistic text.
- Wild blueberries may support heart, gut and brain health
- Hidden bias may help explain chronic indecision and why some people get stuck
- Regular aerobic exercise may slow brain ageing in midlife
That matters because healthcare is full of exactly this kind of content: notes, summaries, referrals, and patient messages. If AI is used to draft, summarise, or advise based on these documents, it could unintentionally amplify an error.
Rather than relying on vague assurances that a model is “safe”, the researchers propose treating susceptibility to medical misinformation as something that can be measured.
In practical terms, that means stress-testing models using large datasets that include realistic misinformation, and building systems that verify medical claims against external evidence before presenting them as fact.
For patients, including people living with diabetes, the message is not “avoid AI” – it is “treat AI output like a draft, not a diagnosis”.
If an AI tool gives you medical guidance that is new, surprising, or confident about a treatment, it should be checked against trusted clinical sources or a healthcare professional.
The study shows that a convincing tone can be enough to fool a system into repeating a lie, and that is exactly the kind of failure mode healthcare needs to design around.





