New AI Model Predicts Disease Risk Decades in Advance

Researchers have unveiled a pioneering AI model capable of predicting disease risks decades in advance by analyzing large-scale health records. This innovation could revolutionize health care by enabling earlier interventions and more personalized treatment plans.

In a new study published in the journal Nature, researchers from the European Molecular Biology Laboratory (EMBL), the German Cancer Research Centre (DKFZ) and the University of Copenhagen describe an AI model that predicts the risk and timing of more than 1,000 diseases more than a decade in advance.

This new generative AI model employs algorithmic principles akin to those used in large language models (LLMs).

It was trained on anonymized health data from 400,000 participants in the UK Biobank and further validated using data from 1.9 million patients in the Danish National Patient Registry. According to the researchers, this represents one of the most comprehensive demonstrations of generative AI’s potential to model human disease progression at scale.

“Our AI model is a proof of concept, showing that it’s possible for AI to learn many of our long-term health patterns and use this information to generate meaningful predictions,” Ewan Birney, interim executive director at EMBL, said in a news release. “By modeling how illnesses develop over time, we can start to explore when certain risks emerge and how best to plan early interventions. It’s a big step towards more personalized and preventive approaches to healthcare.”

Predicting Future Health Outcomes

Much as large language models learn the structure of sentences, this AI model learns the “grammar” of health data, treating medical histories as sequences of events over time. These events include medical diagnoses and lifestyle factors such as smoking. The model learns to predict disease risk based on the sequence and timing of such events.
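To make the idea of a medical history as a sequence of timed events concrete, here is a minimal sketch in Python. It is not the researchers' code: the event labels, the HealthEvent class and both helper functions are hypothetical names chosen for illustration, and the real model's encoding of diagnoses and lifestyle factors may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthEvent:
    """One entry in a medical history (hypothetical structure)."""
    age_years: float  # age at which the event was recorded
    code: str         # illustrative label, not a real coding scheme


def to_sequence(events):
    """Order events by age so the history reads like a sentence of tokens."""
    ordered = sorted(events, key=lambda e: e.age_years)
    return [(e.code, e.age_years) for e in ordered]


def with_gaps(sequence):
    """Attach the time since the previous event, so timing is explicit."""
    result = []
    previous_age = None
    for code, age in sequence:
        gap = 0.0 if previous_age is None else age - previous_age
        result.append((code, age, gap))
        previous_age = age
    return result


# Made-up history, deliberately out of order to show the sorting step.
example_history = [
    HealthEvent(52.5, "HYPERTENSION"),
    HealthEvent(45.0, "SMOKING"),
    HealthEvent(58.0, "TYPE_2_DIABETES"),
]

example_sequence = with_gaps(to_sequence(example_history))
```

A sequence model would then be trained to predict the next event and when it occurs, given everything earlier in the list.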

“Medical events often follow predictable patterns,” added Tom Fitzgerald, staff scientist at EMBL’s European Bioinformatics Institute (EMBL-EBI). “Our AI model learns those patterns and can forecast future health outcomes. It gives us a way to explore what might happen based on a person’s medical history and other key factors. Crucially, this is not a certainty, but an estimate of the potential risks.”

The model performs best on conditions with clear progression patterns, such as certain cancers, heart attacks and septicemia, but is less reliable for more variable conditions like mental health disorders or pregnancy-related complications.

Its Use and Limitations

Much like weather forecasts, the AI model provides probabilities rather than certainties. For instance, it could estimate a person’s risk of developing heart disease within a specified period, expressed as a rate over time, much as a forecast gives a 70% chance of rain.

Short-term forecasts tend to be more accurate than long-range predictions.

For example, among men aged 60-65 in the UK Biobank cohort, the model’s estimated annual risk of heart attack ranges from 4 in 10,000 to 1 in 100, depending on prior diagnoses and lifestyle. Women have a lower risk on average, but the spread of their risk is similar to that of men.

These forecasts align well with the cases actually observed across different demographic groups in the UK Biobank.
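To put the annual heart attack figures above in perspective, the short Python sketch below converts a yearly probability into the chance of at least one event over several years. It rests on a simplifying assumption that is ours, not the study's: the annual risk stays constant and each year is independent. The model itself estimates risks that change over time, so this is only back-of-the-envelope arithmetic, and the function name is hypothetical.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Chance of at least one event in `years` years.

    Assumes a constant, independent annual risk (a simplification).
    """
    if not 0.0 <= annual_risk <= 1.0:
        raise ValueError("annual_risk must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    # Probability of no event in any year, then take the complement.
    return 1.0 - (1.0 - annual_risk) ** years


low_annual = 4 / 10_000   # lowest annual risk quoted in the article
high_annual = 1 / 100     # highest annual risk quoted in the article

for label, risk in (("low", low_annual), ("high", high_annual)):
    ten_year = cumulative_risk(risk, 10)
    print(f"{label}: {ten_year:.2%} over 10 years")
```

Under that assumption, the lower figure works out to roughly 0.4% over 10 years and the higher one to just under 10%, which shows how far apart two men of the same age can sit.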

The model is designed to generate accurate population-level risk estimates but has limitations. Since it was primarily trained on individuals aged 40–60, childhood and adolescent health events are underrepresented.

Additionally, demographic biases exist due to gaps in training data, including underrepresentation of certain ethnic groups.

While the AI model isn’t yet ready for clinical use, it holds promise for researchers to understand disease progression, explore how lifestyle and past illnesses influence long-term risks, and simulate health outcomes where real-world data are limited or inaccessible.

“This is the beginning of a new way to understand human health and disease progression,” added Moritz Gerstung, head of the Division of AI in Oncology at DKFZ and former group leader at EMBL-EBI. “Generative models such as ours could one day help personalize care and anticipate healthcare needs at scale. By learning from large populations, these models offer a powerful lens into how diseases unfold and could eventually support earlier, more tailored interventions.”

Source: European Molecular Biology Laboratory