New research from Stevens Institute of Technology finds that artificial intelligence can sharpen doctors’ breast cancer image diagnoses — but only when its explanations are designed to support, not overload, clinicians.
Artificial intelligence is already helping doctors spot cancer earlier and more accurately. But new research from Stevens Institute of Technology suggests that how AI explains its decisions can make the difference between life-saving support and dangerous distraction.
In two recent studies focused on breast cancer imaging, Stevens researchers found that AI tools can improve how accurately oncologists and radiologists read medical images. At the same time, they discovered that piling on extra explanations about how the AI reached its conclusions can slow clinicians down, increase their mental workload and, in some cases, make them more likely to make mistakes.
The work zeroes in on a central tension in modern medicine: how to harness the power of AI while keeping human experts firmly in control.
AI systems are already widely used to scan X-rays, MRIs and CT images for subtle patterns that can be hard for humans to see. With access to large medical datasets and powerful computing, these systems can sift through data at a speed no person can match.
“AI systems can process thousands of images quickly and provide predictions much faster than human reviewers,” senior author Onur Asan, an associate professor at Stevens whose research focuses on how people interact with technology in health care, said in a news release. “Unlike humans, AI does not get tired or lose focus over time.”
Yet many clinicians remain wary of relying on AI. A major reason is that many systems function as a so-called “black box,” offering a prediction or risk score without a clear explanation of how they got there.
“When clinicians don’t know how AI generates its predictions, they are less likely to trust it,” added Asan. “So we wanted to find out whether providing extra explanations may help clinicians, and how different degrees of AI explainability influence diagnostic accuracy, as well as trust in the system.”
To explore that question, Asan worked with doctoral student Olya Rezaeian at Stevens and assistant professor Alparslan Emrah Bayrak at Lehigh University. The team studied 28 oncologists and radiologists as they used an AI system to analyze breast cancer images.
All of the clinicians saw AI-generated assessments of the images. Some also received additional layers of explanation about how the AI arrived at its conclusions. After reviewing the images, participants answered questions about how confident they were in the AI’s assessment and how difficult they found the task.
The researchers reported that clinicians who used AI were more accurate overall than those in a control group who did not have AI support. But the benefits came with important conditions.
One key finding: more explanation was not always better.
“We found that more explainability doesn’t equal more trust,” Asan said.
The team observed that when explanations became more detailed or complex, clinicians had to spend more time processing that information. That extra effort pulled attention away from the images themselves and slowed decision-making, which in turn hurt overall performance.
“Processing more information adds more cognitive workload to clinicians. It also makes them more likely to make mistakes and possibly harm the patient,” Asan noted. “You don’t want to add cognitive load to the users by adding more tasks.”
The studies also flagged a different kind of risk: overconfidence in AI. In some cases, clinicians trusted the system’s output so strongly that they were less likely to question it, even when it was wrong.
“If an AI system is not designed well and makes some errors while users have high confidence in it, some clinicians may develop a blind trust believing that whatever the AI is suggesting is true, and not scrutinize the results enough,” Asan cautioned.
The team’s findings are detailed in two papers: one on the impact of AI explanations on trust and diagnostic accuracy in breast cancer, published in Applied Ergonomics, and a second on explainability and AI confidence in clinical decision support systems, published in the International Journal of Human–Computer Interaction.
Together, the studies point to a design challenge for the next generation of medical AI: build tools that are transparent enough to be trustworthy, but simple enough to be usable in the high-pressure environment of clinical care.
“Our findings suggest that designers should exercise caution when building explanations into the AI systems,” Asan said.
He added that explanations should be crafted so they support clinicians’ thinking instead of overwhelming them.
Training will also be critical. Asan emphasized that AI should be seen as an assistant to human expertise, not a replacement for it.
“Clinicians who use AI should receive training that emphasizes interpreting the AI outputs and not just trusting it,” he said.
Beyond explainability, Asan pointed to a broader principle from technology adoption research: people are most likely to use a tool when they believe it is both helpful and easy to use.
“Research finds that there are two main parameters for a person to use any form of technology — perceived usefulness and perceived ease of use,” he added. “So if doctors will think that this tool is useful for doing their job, and it’s easy to use, they are going to use it.”
For patients, the stakes are high. Breast cancer remains one of the most common cancers worldwide, and early, accurate detection can dramatically improve outcomes. AI has the potential to catch tumors earlier, reduce missed diagnoses and help standardize care across hospitals and clinics.
But the Stevens research underscores that simply adding AI to the reading room is not enough. To truly improve care, systems must be designed with clinicians’ cognitive limits and workflows in mind, and health systems must invest in training that helps doctors understand when to lean on AI and when to question it.
As AI continues to spread through medicine, studies like these offer a roadmap: build tools that amplify human judgment, avoid overloading already stretched clinicians and keep patient safety at the center of every design choice.
Source: Stevens Institute of Technology

