Researchers trying to improve health care with artificial intelligence usually subject their algorithms to a form of machine med school. Software learns from doctors by digesting thousands or millions of x-rays or other data labeled by expert humans until it can accurately flag suspect moles or lungs showing signs of Covid-19 by itself.
A study published this month took a different approach—training algorithms to read knee x-rays for arthritis by using patients as the AI arbiters of truth instead of doctors. The results revealed that radiologists may have literal blind spots when it comes to reading Black patients’ x-rays.
The algorithms trained on patients’ reports did a better job than doctors at accounting for the pain experienced by Black patients, apparently by discovering patterns of disease in the images that humans usually overlook.
“This sends a signal to radiologists and other doctors that we may need to reevaluate our current strategies,” says Said Ibrahim, a professor at Weill Cornell Medicine, in New York City, who researches health inequalities, and who was not involved in the study.
Algorithms designed to reveal what doctors don’t see, instead of mimicking their knowledge, could make health care more equitable. In a commentary on the new study, Ibrahim suggested it could help reduce disparities in who gets surgery for arthritis. African American patients are about 40 percent less likely than others to receive a knee replacement, he says, even though they are at least as likely to suffer osteoarthritis. Differences in income and insurance likely play a part, but so could differences in diagnosis.
“The algorithm was seeing things over and above what the radiologists were seeing.”
Ziad Obermeyer, professor, UC Berkeley School of Public Health
Ziad Obermeyer, an author of the study and a professor at the University of California Berkeley’s School of Public Health, was inspired to use AI to probe what radiologists weren’t seeing by a medical puzzle. Data from a long-running National Institutes of Health study on knee osteoarthritis showed that Black patients and people with lower incomes reported more pain than other patients with x-rays radiologists scored as similar. The differences might stem from physical factors unknown to keepers of knee knowledge, or psychological and social differences—but how to tease those apart?
Obermeyer and researchers from Stanford, Harvard, and the University of Chicago created computer vision software using the NIH data to investigate what human doctors might be missing. They programmed algorithms to predict a patient’s pain level from an x-ray. Over tens of thousands of images, the software discovered patterns of pixels that correlate with pain.
When given an x-ray it hasn’t seen before, the software uses those patterns to predict the pain a patient would report experiencing. Those predictions correlated more closely with patients’ pain than the scores radiologists assigned to knee x-rays, particularly for Black patients. That suggests the algorithms had learned to detect evidence of disease that radiologists didn’t. “The algorithm was seeing things over and above what the radiologists were seeing—things that are more commonly causes of pain in Black patients,” Obermeyer says.
The WIRED Guide to Artificial Intelligence
Supersmart algorithms won’t take all the jobs, But they are learning faster than ever, doing everything from medical diagnostics to serving up ads.
History may explain why radiologists aren’t as proficient in assessing knee pain in Black patients. The standard grading used today originated in a small 1957 study in a northern England mill town with a less diverse population than the modern US. Doctors used what they saw to devise a way to grade the severity of osteoarthritis based on observations such as narrowed cartilage. X-ray equipment, lifestyles, and many other factors have changed a lot since. “It’s not surprising that that fails to capture what doctors see in the clinic today,” Obermeyer says.
The study is notable not just for showing what happens when AI is trained by patient feedback instead of expert opinions, but because medical algorithms have more often been seen as a cause of bias, not a cure. In 2019, Obermeyer and collaborators showed that an algorithm guiding care for millions of US patients gave white people priority over Black people for assistance with complex conditions such as diabetes.