An AI revolution is underway in medicine, promising to reshape healthcare as we know it. Emerging generalist models, known as foundation models, are poised to overcome the limitations of first-generation machine-learning tools in clinical applications, and big technology companies are already investing in their development and integration into medical imaging and diagnostics.
Foundation models in medicine
Foundation models offer a versatile framework for advancing medical AI. One key advantage is their adaptability across medical specialties. Ophthalmology, for instance, stands out as a prime candidate because high-resolution imaging data are available for nearly every part of the eye, giving these models ample material with which to improve diagnostic accuracy and efficiency.
Major technology companies are actively investing in medical imaging foundation models that leverage diverse image types, such as skin photographs, retinal scans, X-rays, and pathology slides. These models also incorporate electronic health records and genomics data, providing a holistic view of patient health. In June, Google Research unveiled REMEDIS (robust and efficient medical imaging with self-supervision), a groundbreaking approach that significantly improved diagnostic accuracies compared to supervised learning methods. This approach relies on pre-training models with large datasets of unlabeled images, reducing the need for labeled data.
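Pre-training of this kind typically relies on contrastive self-supervision: a network first learns from unlabeled scans, and only then is fine-tuned on a small labeled set. The PyTorch sketch below illustrates that two-stage recipe in miniature; the tiny network, noise-based augmentations, and simplified loss are illustrative stand-ins under that assumption, not Google's actual REMEDIS pipeline.

```python
# Minimal sketch of self-supervised pre-training followed by label-efficient
# fine-tuning, in the spirit of approaches such as REMEDIS. Model sizes,
# augmentations and loss details are illustrative, not Google's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny stand-in for the image backbone (real systems use large ResNets)."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )
    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, temperature=0.1):
    """Simplified NT-Xent loss: two augmented views of the same unlabeled image
    should map to nearby embeddings, different images to distant ones."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # similarity of every view pair
    targets = torch.arange(z1.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Stage 1: pre-train on unlabeled scans (random tensors stand in for real data).
for _ in range(10):
    batch = torch.randn(16, 3, 64, 64)
    view1 = batch + 0.1 * torch.randn_like(batch)
    view2 = batch + 0.1 * torch.randn_like(batch)
    loss = contrastive_loss(encoder(view1), encoder(view2))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: fine-tune a small head on the few labeled images that are available.
head = nn.Linear(128, 2)                        # e.g. disease vs. no disease
clf_opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
labeled_x, labeled_y = torch.randn(8, 3, 64, 64), torch.randint(0, 2, (8,))
loss = F.cross_entropy(head(encoder(labeled_x)), labeled_y)
clf_opt.zero_grad(); loss.backward(); clf_opt.step()
```

Because the encoder has already learned useful structure from unlabeled images in stage 1, stage 2 can reach good accuracy with far fewer expert-labeled examples, which is the practical appeal of the approach.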
Google’s multimodal approach
Google researchers have gone a step further by combining REMEDIS with Med-PaLM, their large language model. This integration resulted in Med-PaLM Multimodal, a single AI system capable of interpreting medical images, such as chest X-rays, and generating natural language medical reports. This multimodal approach represents a leap forward in medical AI capabilities, offering a blend of image interpretation and textual analysis.
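The general pattern behind such a system is to encode the image and project it into the language model's embedding space, so that a text decoder can generate a report conditioned on the scan. The following sketch shows that pattern only; the dimensions, module names, and vocabulary are hypothetical and do not describe Med-PaLM Multimodal's actual architecture.

```python
# Illustrative sketch of the image-to-report pattern: encode the scan, project
# it into the language model's embedding space, and decode a report token by
# token. All sizes and names are hypothetical stand-ins.
import torch
import torch.nn as nn

class ImageToReport(nn.Module):
    def __init__(self, img_dim=128, lm_dim=256, vocab_size=1000):
        super().__init__()
        self.image_encoder = nn.Sequential(           # stand-in vision backbone
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, img_dim),
        )
        self.project = nn.Linear(img_dim, lm_dim)     # map image features into LM space
        self.token_emb = nn.Embedding(vocab_size, lm_dim)
        decoder_layer = nn.TransformerDecoderLayer(d_model=lm_dim, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
        self.lm_head = nn.Linear(lm_dim, vocab_size)

    def forward(self, xray, report_tokens):
        # The image becomes a "visual token" that the text decoder attends to.
        visual = self.project(self.image_encoder(xray)).unsqueeze(1)   # (B, 1, lm_dim)
        text = self.token_emb(report_tokens)                           # (B, T, lm_dim)
        hidden = self.decoder(tgt=text, memory=visual)
        return self.lm_head(hidden)                                    # next-token logits

model = ImageToReport()
xray = torch.randn(2, 1, 64, 64)              # two dummy chest X-rays
report = torch.randint(0, 1000, (2, 12))      # tokenized draft reports
logits = model(xray, report)                  # (2, 12, 1000), trained with cross-entropy
```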
Microsoft’s language and vision integration
Microsoft is also working to integrate language and vision in a unified medical AI tool. Its model, LLaVA-Med (Large Language and Vision Assistant for biomedicine), was trained on images paired with text extracted from PubMed Central, a comprehensive database of biomedical articles. The approach lets users converse with the AI about images, much as they do with text-based systems such as ChatGPT, but it requires vast quantities of text-image pairs; Microsoft's team collected over 46 million pairs from PubMed Central.
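One common way to prepare such data is to convert figure-caption pairs into conversational question-answer examples. The sketch below shows that idea with hypothetical records and question templates; LLaVA-Med's actual pipeline goes further, using a language model to generate richer multi-turn conversations from the captions.

```python
# Sketch of turning figure-caption pairs into conversational training examples,
# in the spirit of LLaVA-Med's data preparation. The records, field names and
# question templates below are hypothetical.
import json
import random

caption_records = [
    {"image": "PMC123456_fig1.png",
     "caption": "Chest CT showing a ground-glass opacity in the right upper lobe."},
    {"image": "PMC789012_fig2.png",
     "caption": "Histopathology of the biopsy demonstrating granuloma formation."},
]

QUESTION_TEMPLATES = [
    "What does this image show?",
    "Can you describe the key finding in this figure?",
    "Summarize what is depicted here.",
]

def to_conversation(record):
    """Pair an image with a templated question and use its caption as the answer."""
    return {
        "image": record["image"],
        "conversations": [
            {"from": "human", "value": random.choice(QUESTION_TEMPLATES)},
            {"from": "assistant", "value": record["caption"]},
        ],
    }

training_examples = [to_conversation(r) for r in caption_records]
print(json.dumps(training_examples[0], indent=2))
```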
Unlocking unseen patterns
As foundation models are trained on ever-expanding datasets, there’s growing optimism that they can uncover patterns and insights that may elude human observers. For instance, Google’s 2018 study demonstrated AI models’ ability to identify characteristics like age and gender from retinal images, a feat beyond even experienced ophthalmologists. This potential to unveil scientific information embedded in high-dimensional images holds promise for various medical applications.
One domain where AI tools could surpass human capabilities is digital pathology, for example in predicting tumor response to immunotherapy. AI systems can analyze vast amounts of patient data to identify patterns that distinguish exceptional responders from non-responders. These insights could revolutionize treatment strategies, enabling therapies tailored to an individual's unique tumor microenvironment. However, while the diagnostic potential of AI is exciting, it is essential to set high standards for judging these systems' success.
Despite their remarkable capabilities, even the best-performing AI models in medical imaging still fall short of human radiologists: an X-ray report written by a radiologist remains superior to one generated by state-of-the-art multimodal generalist systems. Ensuring the safe and responsible use of foundation models in clinical care remains a paramount concern, and although the potential applications of these models are vast, rigorous testing and validation are needed before widespread clinical implementation.
Training for the future
Many experts believe that AI will play a growing role in medicine, but that it won't replace medical professionals; rather, it will complement their expertise. Initiatives like free AI literacy courses for radiologists aim to demystify AI and manage expectations, equipping healthcare professionals with the knowledge to use it as a valuable tool in their practice.
Foundation models, with their adaptability, enhanced diagnostic capabilities, and capacity to surface new insights, are poised to transform healthcare. Challenges remain and responsible use is essential, but the integration of AI into clinical practice holds tremendous promise: rather than replacing human expertise, AI can serve as a valuable partner in delivering better outcomes for patients worldwide. As the field of medical AI continues to evolve, its impact on healthcare will be profound and far-reaching.