Researchers have developed CytoDiffusion, an AI tool that analyzes blood cells and, in the team's tests, identified abnormalities slightly more accurately than human experts. The tool could make the diagnosis of conditions such as leukemia more reliable and efficient.
A new artificial intelligence tool called CytoDiffusion could change how blood disorders are diagnosed, after outperforming human specialists at spotting abnormal cells in testing.
Developed by researchers from the University of Cambridge, University College London (UCL) and Queen Mary University of London, CytoDiffusion uses generative AI, the same type of technology behind image generators such as DALL-E, to analyze the shape and structure of blood cells. The findings are published in the journal Nature Machine Intelligence.
Detecting subtle differences in blood cell size, shape and appearance is essential to diagnosing many blood disorders. The task, however, takes years of training to master, and different doctors can still disagree on difficult cases.
Blood smears consist of thousands of cells, making comprehensive human analysis impractical.
“Humans can’t look at all the cells in a smear – it’s just not possible,” first author Simon Deltadahl, a doctoral student in Cambridge’s Department of Applied Mathematics and Theoretical Physics, said in a news release. “Our model can automate that process, triage the routine cases, and highlight anything unusual for human review.”
This innovation addresses a significant bottleneck in hematology.
“The clinical challenge I faced as a junior haematology doctor was that after a day of work, I would have a lot of blood films to analyse,” added co-senior author Suthesh Sivapalaratnam, a clinical senior lecturer at Queen Mary University of London. “As I was analysing them in the late hours, I became convinced AI would do a better job than me.”
The development of CytoDiffusion entailed training the AI on over half a million images from blood smears at Addenbrooke’s Hospital in Cambridge, forming the largest dataset of its kind. This extensive dataset enabled the AI to recognize not just common blood cell types but also rare and unusual cells indicative of disease.
Because it models the full distribution of cell appearances, CytoDiffusion proved robust to variations in hospital equipment, microscopes and staining methods. It was also more sensitive than existing systems at detecting abnormal cells associated with leukemia, even when given fewer training examples.
“When we tested its accuracy, the system was slightly better than humans,” Deltadahl added. “But where it really stood out was in knowing when it was uncertain. Our model would never say it was certain and then be wrong, but that is something that humans sometimes do.”
The AI also generated synthetic blood cell images that were indistinguishable from real ones. In a “Turing test” with 10 experienced hematologists, the experts could not reliably tell the real images from the AI-generated ones.
“That really surprised me,” added Deltadahl. “These are people who stare at blood cells all day, and even they couldn’t tell.”
The researchers also plan to release the world’s largest publicly available dataset of peripheral blood smear images.
“By making this resource open, we hope to empower researchers worldwide to build and test new AI models, democratise access to high-quality medical data, and ultimately contribute to better patient care,” Deltadahl added.
Although the results are promising, CytoDiffusion is intended to complement rather than replace trained clinicians. It is designed to speed up the review of routine cases and flag anomalies for closer inspection by specialists.
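To make the triage idea concrete, the following is a minimal Python sketch of how confidence-based routing could work in principle. It is not CytoDiffusion's actual method or code: the `CellPrediction` type, the `triage` function, the class names and the 0.9 confidence threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class CellPrediction:
    """Hypothetical model output for one cell image."""
    cell_id: str
    class_probs: Dict[str, float]  # assumed per-class probabilities


def triage(
    predictions: List[CellPrediction],
    abnormal_classes: Set[str],
    min_confidence: float = 0.9,  # illustrative threshold, not from the study
) -> Tuple[List[str], List[str]]:
    """Split cells into routine ones and ones flagged for human review.

    A cell is flagged if the model's top class is considered abnormal,
    or if the model is not confident enough in its top class.
    """
    routine: List[str] = []
    flagged: List[str] = []
    for pred in predictions:
        top_class = max(pred.class_probs, key=pred.class_probs.get)
        confidence = pred.class_probs[top_class]
        if top_class in abnormal_classes or confidence < min_confidence:
            flagged.append(pred.cell_id)
        else:
            routine.append(pred.cell_id)
    return routine, flagged


if __name__ == "__main__":
    example = [
        CellPrediction("cell-1", {"neutrophil": 0.97, "blast": 0.03}),
        CellPrediction("cell-2", {"neutrophil": 0.55, "lymphocyte": 0.45}),
        CellPrediction("cell-3", {"blast": 0.95, "neutrophil": 0.05}),
    ]
    routine, flagged = triage(example, abnormal_classes={"blast"})
    print("routine:", routine)
    print("flagged for review:", flagged)
```

In this sketch, a cell goes to a specialist either because the model thinks it looks abnormal or because the model is unsure, which mirrors the division of labor the researchers describe.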
“The true value of healthcare AI lies not in approximating human expertise at lower cost, but in enabling greater diagnostic, prognostic, and prescriptive power than either experts or simple statistical models can achieve,” added co-senior author Parashkev Nachev, a UCL professor of neurology.
The team emphasizes the need for further work to enhance the system’s speed and to validate its effectiveness across diverse patient populations to ensure fairness and accuracy.
Source: University of Cambridge

