Every patient, irrespective of their physical characteristics and identities, should have access to good healthcare. However, certain people and groups are often denied fair treatment in the healthcare system because of inequities and implicit biases in medical diagnosis and treatment.
AI Models in Healthcare Are Biased
A group of researchers from the Massachusetts Institute of Technology (MIT) has found that artificial intelligence and machine learning models can deepen healthcare disparities and inequities among subgroups that are often underrepresented in the data, affecting how those groups are diagnosed and treated.
Led by Marzyeh Ghassemi, an assistant professor in MIT’s Department of Electrical Engineering and Computer Science (EECS), the researchers released a paper analyzing the roots of the disparities that can arise in AI, causing models that perform well overall to falter on underrepresented subgroups.
The analysis focused on “subpopulation shifts,” which the report defined as the “differences in the way machine learning models perform for one subgroup as compared to another.” The primary objective was to determine the kinds of subpopulation shifts that can occur with AI techniques and potentially inform future advancements for more equitable models.
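As a rough illustration of what such a shift looks like in practice, the sketch below compares a model’s overall accuracy with its accuracy on each subgroup. The data, function, and variable names are made up for illustration and are not taken from the MIT paper.

```python
# Minimal sketch: quantifying a subpopulation shift as the gap between
# overall accuracy and the accuracy of the worst-performing subgroup.
# All data and names here are illustrative only.
import numpy as np

def subgroup_accuracies(y_true, y_pred, groups):
    """Return overall accuracy and per-subgroup accuracy."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    overall = float(np.mean(y_true == y_pred))
    per_group = {
        str(g): float(np.mean(y_true[groups == g] == y_pred[groups == g]))
        for g in np.unique(groups)
    }
    return overall, per_group

# Toy example: predictions for 8 patients, split into two subgroups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall, per_group = subgroup_accuracies(y_true, y_pred, groups)
worst = min(per_group.values())
print(f"overall accuracy: {overall:.2f}")      # 0.62
print(f"per-group accuracy: {per_group}")      # e.g. {'A': 0.75, 'B': 0.5}
print(f"gap to worst group: {overall - worst:.2f}")
```

A model can look acceptable on the overall number while performing markedly worse for one subgroup, which is exactly the kind of gap the researchers set out to characterize.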
“We want the models to be fair and work equally well for all groups, but instead, we consistently observe the presence of shifts among different groups that can lead to inferior medical diagnosis and treatment,” says MIT Ph.D. student Yuzhe Yang.
Researchers Identify 4 Shifts That Drive Bias in AI Models
The MIT researchers identified four types of shifts – spurious correlations, attribute imbalance, class imbalance, and attribute generalization – that cause inequities and biases in AI models.
“Biases can, in fact, stem from what the researchers call the class, or from the attribute, or both,” the report reads.
The researchers gave an example in which machine learning models are used to determine whether a person has pneumonia based on X-ray images. The dataset has two attributes – the people being X-rayed are either female or male – and two classes – people who have the lung ailment, and people who are infection-free.
“If, in this particular dataset, there were 100 males diagnosed with pneumonia for every female diagnosed with pneumonia, that could lead to an attribute imbalance, and the model would likely do a better job of correctly detecting pneumonia for a man than for a woman,” the team explained.
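The toy sketch below mirrors that setup: two classes crossed with two attributes, with made-up counts that reproduce the kind of attribute imbalance the team describes.

```python
# Illustrative sketch of the pneumonia example above: two classes
# (pneumonia vs. healthy) crossed with two attributes (male / female).
# The counts are invented to mirror the 100:1 imbalance described in the article.
from collections import Counter

# Each record: (class_label, attribute)
records = (
    [("pneumonia", "male")] * 1000     # many positive male examples
    + [("pneumonia", "female")] * 10   # very few positive female examples -> attribute imbalance
    + [("healthy", "male")] * 1000
    + [("healthy", "female")] * 1000
)

counts = Counter(records)
for (label, attr), n in sorted(counts.items()):
    print(f"{label:10s} {attr:7s} {n:5d}")

# A model trained on this data sees far fewer (pneumonia, female) examples,
# so it tends to detect pneumonia less reliably for women than for men.
```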
Can AI Models Work in an Unbiased Manner?
The MIT researchers said they were able to reduce the occurrence of spurious correlations, class imbalance, and attribute imbalance by improving the “classifier” and “encoder.” However, the fourth shift, “attribute generalization,” persisted.
“No matter what we did to the encoder or classifier, we did not see any improvements in terms of attribute generalization,” Yang says, “and we don’t yet know how to address that.”
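One generic way to “improve the classifier” along these lines is to reweight the training loss so that rare (class, attribute) groups carry as much weight as common ones. The PyTorch sketch below illustrates that idea; it is an assumed, simplified mitigation for illustration, not the specific technique used in the MIT paper, and all names and data are made up.

```python
# Minimal sketch of a classifier-side mitigation: weight each training sample
# inversely to the size of its (class, attribute) group, so rare groups
# contribute as much to the loss as common ones. Illustrative only.
import torch
import torch.nn as nn

def group_weights(group_ids, num_groups):
    """Per-sample weights inversely proportional to group size, mean-normalized to 1."""
    counts = torch.bincount(group_ids, minlength=num_groups).float()
    weights = 1.0 / counts[group_ids]
    return weights / weights.sum() * len(group_ids)

# Toy batch: group 0 = (pneumonia, male), 1 = (pneumonia, female),
#            group 2 = (healthy, male),   3 = (healthy, female).
logits = torch.randn(8, 2, requires_grad=True)
labels = torch.tensor([1, 1, 1, 1, 0, 0, 0, 1])
groups = torch.tensor([0, 0, 0, 1, 2, 2, 3, 1])   # the "female" groups are rare

per_sample_loss = nn.CrossEntropyLoss(reduction="none")(logits, labels)
weighted_loss = (group_weights(groups, num_groups=4) * per_sample_loss).mean()
weighted_loss.backward()   # rare groups now contribute more to the gradient
```

Reweighting or resampling of this kind can help with spurious correlations and class or attribute imbalance, but, as the quote above notes, it does not by itself address attribute generalization, where an attribute seen at test time is missing from the training data entirely.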
The team is currently exploring public datasets containing tens of thousands of patients and chest X-rays to determine whether machine learning models can deliver fair medical diagnosis and treatment in practice.
However, they acknowledged that a better understanding of the sources of unfairness, and of how they permeate the current system, is still needed to achieve the desired equity.