Artificial intelligence (AI) models have demonstrated remarkable proficiency in the field of evidence-based medicine (EBM), offering a promising solution to the challenges faced by healthcare practitioners in staying up-to-date with the latest research findings. A recent study conducted by researchers at Mount Sinai’s Icahn School of Medicine has shed light on the potential of large language models (LLMs) in revolutionizing independent medical practice.
AI models and evidence-based medicine
Evidence-based medicine involves utilizing the best available research evidence to make patient clinical decisions, moving away from traditional methods and personal beliefs. In today’s rapidly evolving medical landscape, keeping pace with the influx of new research is a formidable task for healthcare professionals. However, the study suggests that AI chatbots, particularly ChatGPT-4, could offer a promising solution to this complexity.
The research team tested the capabilities of various AI models, including OpenAI’s ChatGPT, Gemini, LLAMA v2, and Mixtral-8x7B. These models were given access to previously curated case files and tasked with making clinical decisions based on the available data. The researchers assessed their performance using several metrics.
ChatGPT-4 leads the way
In their report, the researchers evaluated the AI models’ resistance to hallucinations, the validity of their clinical decisions, and their adherence to clinical guidelines. The standout performer in this study was ChatGPT-4, which exhibited the most capability to function in a clinical setting without human intervention, surpassing other LLMs.
According to the report, “LLMs can function as autonomous practitioners of evidence-based medicine.” It highlights their potential to interact with real-world healthcare systems and manage patient tasks following established guidelines.
Despite the impressive performance of LLMs in EBM, the study identified several areas requiring improvement in their operations. One significant limitation is that mainstream LLMs often have a training cutoff in 2021, rendering them unaware of new medical data beyond that date. The report notes that updating these models with new medical information is a costly endeavor that may hinder their practical application.
Furthermore, there is a concern about the risk of hallucinations when asking LLMs to generate information on unfamiliar medical subjects. Additionally, there is a lack of data on cultural considerations and antibiotic resistance, which could impact the accuracy of clinical decisions.
Innovative solutions
To enhance the performance of LLMs in EBM, the researchers introduced a new tool called Retrieval Augmented Generation (RAG). This approach involves providing task-specific information to AI models, effectively improving the quality of their responses.
Prompt engineering was identified as another method to refine LLM responses. By instructing the LLM with specific information, such as “You are a professor of medicine,” the researchers found that responses became more tailored to the patient and the healthcare system.
The researchers acknowledge limitations in the models’ ability to handle complex guidelines and diagnostic nuances but believe that Retrieval Augmented Generation can help address these issues, making recommendations more patient-centric and adaptable to healthcare systems.
AI and medicine in a promising future
Integrating emerging technologies such as AI and blockchain is rapidly advancing in medicine and public health. Research is underway to explore AI’s potential in cancer detection and epidemic tracking areas.
For AI to thrive and operate within the boundaries of the law while addressing growing challenges, experts suggest integrating an enterprise blockchain system. Such a system would ensure data input quality and ownership, safeguarding data integrity and immutability.
The Mount Sinai study highlights the potential of AI, particularly ChatGPT-4, in transforming evidence-based medicine. While challenges exist, innovative solutions like Retrieval Augmented Generation offer promising ways to enhance the performance of AI models in clinical settings. As technology continues to evolve, the future of AI in healthcare holds great promise for improving patient care and clinical decision-making.