When seeking answers to health-related questions, it’s essential to exercise caution, especially when using AI platforms like ChatGPT. Researchers from Montreal’s Centre hospitalier universitaire Sainte-Justine (CHU Sainte-Justine) and the Montreal Children’s Hospital conducted a study that underscores the need for vigilance. They engaged ChatGPT in responding to 20 medical inquiries and found that the AI’s answers were often of dubious quality, containing factual errors and fabricated references.
Challenges in AI-generated medical information
Dr. Jocelyn Gravel, the lead author of the study and an emergency physician at CHU Sainte-Justine, emphasized the alarming nature of these findings, particularly in relation to the trustworthiness of scientific communication. He expressed concern about the potential impact of incorrect medical information provided by AI platforms like ChatGPT. Dr. Gravel noted that users should exercise caution when incorporating AI-generated references into their medical manuscripts.
Researchers’ approach and findings
For this study, the researchers sought to evaluate the quality and accuracy of references provided by ChatGPT. They drew their questions from existing medical studies and asked the AI model to substantiate its responses with relevant references. The answers were then assessed by the original authors of the source articles from which the questions originated. Seventeen authors took part in this assessment, rating the responses for quality and accuracy.
The results were concerning. ChatGPT’s responses were rated as being of questionable quality, with a median score of 60%. Both major and minor factual errors were detected. In one example, ChatGPT suggested administering an anti-inflammatory drug by injection rather than by mouth. In another significant error, the AI overstated the global mortality rate linked to Shigella infections tenfold.
Invented references and their deceptive nature
An alarming discovery was the prevalence of invented references provided by ChatGPT: 69% of the references were entirely fabricated, yet they appeared authentic at first glance. The fabricated references used the names of established authors and reputable organizations such as the U.S. Centers for Disease Control and Prevention or the U.S. Food and Drug Administration. They carried titles relevant to the subject matter and even cited well-known publications or websites. Even the genuine references were found to contain errors.
When asked whether its references were accurate, ChatGPT gave answers that raised further concerns. The AI model said it intended to offer accurate and up-to-date information but acknowledged that its output could contain errors or inaccuracies. This uncertainty adds to the challenges of relying on such technology for critical matters like health and medical guidance.
The importance of accurate references in science
Dr. Esli Osmanlliu, an emergency physician at the Montreal Children’s Hospital and a scientist in the Child Health and Human Development Program at the Research Institute of the McGill University Health Centre, emphasized the indispensable role of correct references in scientific research. Accurate references not only reflect a thorough literature review but also enable researchers to place their findings in the broader context of previous work. Dr. Osmanlliu noted that creating fake references could be considered fraudulent and that such practices could undermine the integrity of research.
The study’s researchers underscored the potential dangers of relying on AI-generated references for medical information. Clear and seemingly credible references could mask low-quality content, leading researchers and professionals astray. The alluring appearance of reliable sources could lull users into accepting potentially erroneous or misleading information.
In an era where AI holds promise in various domains, including healthcare, this study serves as a stark reminder of its limitations and the importance of critical evaluation, particularly when dealing with matters as crucial as medical information. As technology evolves, users must balance the convenience of AI with the necessity for accurate, verifiable, and trustworthy information.