AI chatbots give misleading medical advice in about half of responses, study finds
Summary
A BMJ Open study found that five popular AI chatbots produced problematic medical advice in roughly 50% of responses, with nearly 20% judged highly problematic.
Content
A new study published in BMJ Open assessed how five popular AI chatbots respond to medical questions and found widespread problems. Researchers from the US, Canada and the UK posed 10 questions spanning five health categories to each platform. The findings are notable because many people turn to these systems for health guidance even though the systems are not licensed to give medical advice.
Key findings:
- The researchers tested ChatGPT, Gemini, Meta AI, Grok and DeepSeek using 10 prompts across five health topics.
- About 50% of the chatbots' responses were judged problematic, and nearly 20% were considered highly problematic.
- Performance was stronger on closed-ended questions and topics such as vaccines and cancer, and weaker on open-ended prompts and areas like stem cells and nutrition.
- Answers were often delivered with confidence; no system produced a fully complete and accurate reference list, and only two responses were refusals to answer (both by Meta AI).
Summary:
The study highlights that popular AI chatbots can produce authoritative-sounding but flawed medical responses, raising concerns about misinformation and public-facing use. The study's authors call for a reevaluation of how these systems are deployed in health communication and for greater oversight. The report did not detail specific regulatory or procedural next steps.
