Artificial intelligence (AI) models in healthcare have shown promise in improving diagnostic decisions, but a recent study conducted at the University of Michigan reveals a concerning caveat. While accurate AI models can enhance clinicians' diagnostic accuracy, models biased by skewed medical data can seriously degrade it. Despite efforts by regulators, such as the U.S. Food and Drug Administration (FDA), to ensure the safety and transparency of AI models, the study, published in JAMA, suggests that clinicians may still be misled even when provided with AI explanations.
The Challenge of Biased AI Models
The research focused on AI models and explanations for patients with acute respiratory failure, a condition that poses complex diagnostic challenges. The study enrolled clinicians, including hospitalist physicians, nurse practitioners, and physician assistants. Their baseline diagnostic accuracy was approximately 73%, underscoring how difficult it is to determine the underlying cause of respiratory failure.
AI Assistance and Diagnostic Accuracy
In the study, clinicians diagnosed patients with respiratory failure and made treatment recommendations based on those diagnoses. The research team evaluated the diagnostic accuracy of 457 healthcare professionals with and without assistance from an AI model. Half of the participants received the AI model's decision together with an explanation, while the other half received the decision alone.
The results indicated that clinicians assisted by an AI model, even without explanations, saw a 2.9 percentage point increase in diagnostic accuracy. When explanations were also provided, the improvement rose to 4.4 percentage points, demonstrating the potential of AI to enhance clinical decision-making.
Unveiling the Pitfalls: Bias in AI Models
To test whether explanations could help clinicians identify biased or incorrect AI models, the research team intentionally presented clinicians with models that had been trained to be biased. For example, one model predicted a high likelihood of pneumonia whenever a patient was 80 years or older, illustrating how skewed training data can produce such shortcuts.
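To see how such a shortcut can arise, here is a minimal, self-contained sketch; it is not the study's actual model, and all feature names and data are hypothetical. A toy pneumonia classifier is trained on a sample that under-represents pneumonia-free patients aged 80 and over:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Hypothetical features: patient age and an idealized chest-finding flag.
age = rng.integers(30, 95, size=n)
infiltrate = rng.binomial(1, 0.3, size=n)

# Ground truth: pneumonia depends on the clinical finding, not on age.
pneumonia = rng.binomial(1, np.where(infiltrate == 1, 0.7, 0.1))

# Skewed training sample: drop most pneumonia-free patients aged 80+,
# so old age becomes spuriously correlated with pneumonia in the data.
keep = (age < 80) | (pneumonia == 1) | (rng.random(n) < 0.2)
X, y = np.column_stack([age, infiltrate])[keep], pneumonia[keep]

model = LogisticRegression().fit(X, y)
print(dict(zip(["age", "infiltrate"], model.coef_[0].round(3))))
# A clearly positive age coefficient reveals the learned shortcut:
# the model treats advanced age itself as evidence of pneumonia.
```

Because the skew exists only in the training sample, a model like this can look accurate when validated on similarly skewed data while silently encoding age as a risk factor.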
The findings revealed a significant decline in accuracy when clinicians were exposed to biased AI models. Even in instances where explanations explicitly highlighted that the AI was relying on clinically irrelevant information, such as associating low bone density in patients over 80 with pneumonia, clinicians' accuracy did not recover. This aligns with concerns about users being deceived by AI models and emphasizes the risks associated with biased training data.
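An explanation tool's job is to surface exactly this kind of reliance. Here is a minimal sketch of what a feature-attribution readout might look like; the feature names and scores are hypothetical, in the spirit of a SHAP-style importance summary rather than the study's actual tool:

```python
# Hypothetical per-feature contribution scores for a pneumonia prediction.
attributions = {
    "chest_xray_infiltrate": 0.42,   # clinically relevant for pneumonia
    "age_over_80": 0.35,             # spurious shortcut from skewed data
    "low_bone_density": 0.21,        # irrelevant to pneumonia
    "white_blood_cell_count": 0.18,  # clinically relevant
}

# Features a clinician would consider relevant to pneumonia;
# anything outside this set gets flagged for scrutiny.
RELEVANT = {"chest_xray_infiltrate", "white_blood_cell_count"}

for feature, score in sorted(attributions.items(), key=lambda kv: -kv[1]):
    flag = "" if feature in RELEVANT else "  <-- not clinically relevant"
    print(f"{feature:<26} {score:+.2f}{flag}")
```

The hard part, as the study's results suggest, is not producing such a readout but ensuring that clinicians notice the flagged features and discount the model's advice accordingly under real clinical time pressure.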
The Role of AI Explanations
While AI explanations were shown to improve diagnostic accuracy in general, the study underscores the challenges in developing effective explanation tools. Clinicians need to understand not only the model’s decision but also the underlying reasoning conveyed by the explanation. The study highlights the need for interdisciplinary discussions to refine explanation tools and ensure effective communication between AI models and clinicians.
Implications for Future Research and Implementation
The University of Michigan research team hopes that their study will stimulate more research into the safe implementation of AI-based models in healthcare, considering the diverse demographics of patients. Additionally, the study emphasizes the need for comprehensive medical education surrounding AI and bias. As AI continues to play a pivotal role in healthcare decision-making, addressing these challenges becomes crucial to ensure patient safety and improved outcomes.
The dual nature of AI in healthcare, as both a beneficial tool and a potential source of bias, necessitates ongoing efforts to strike a balance. Regulators, developers, and clinicians must work collaboratively to enhance the transparency of AI models, develop robust explanation tools, and provide thorough education on AI and bias. As the healthcare industry increasingly integrates AI, addressing these challenges is paramount to building trust in AI applications and ensuring that they contribute positively to clinical decision-making.