Stanford AI Model Reads Sleep Data to Forecast Disease Risk

A new Stanford-developed AI model, SleepFM, turns a single night of lab-recorded sleep into a powerful early warning system for future disease. By learning the “language” of sleep, it can forecast risks for conditions ranging from heart disease to cancer.

A single night in a sleep lab might someday offer more than a diagnosis of snoring or insomnia. It could flag the risk of serious diseases years before symptoms appear.

Stanford Medicine researchers and collaborators have developed an artificial intelligence model, called SleepFM, that can analyze one night of detailed sleep recordings and predict a person’s risk of developing more than 100 different health conditions.

The work, published in the journal Nature Medicine, taps into a resource sleep doctors have been collecting for decades but only partially using: polysomnography, the gold-standard overnight test that tracks brain waves, heart rhythms, breathing, eye movements, muscle activity and more.

Polysomnography is already central to diagnosing sleep apnea and other sleep disorders. But each study generates hours of rich, continuous physiological data that mostly goes unused once a report is written.

“We record an amazing number of signals when we study sleep,” co-senior author Emmanuel Mignot, the Craig Reynolds Professor in Sleep Medicine at Stanford, said in a news release. “It’s a kind of general physiology that we study for eight hours in a subject who’s completely captive. It’s very data rich.”

SleepFM is designed to unlock that data.

Built as a “foundation model” — the same broad AI category that includes large language models — SleepFM was trained on nearly 600,000 hours of polysomnography data from about 65,000 people who underwent sleep studies at clinics. The largest group came from the Stanford Sleep Medicine Center, which has been collecting sleep recordings since the 1970s and pairing them with long-term health records.

For AI researchers, sleep has been an overlooked frontier.

“From an AI perspective, sleep is relatively understudied. There’s a lot of other AI work that’s looking at pathology or cardiology, but relatively little looking at sleep, despite sleep being such an important part of life,” said co-senior author James Zou, an associate professor of biomedical data science at Stanford.

To train SleepFM, the team chopped each overnight study into five-second slices, similar to how language models learn from individual words and phrases. Each slice included multiple data streams, such as brain activity (electroencephalography), heart activity (electrocardiography), muscle activity (electromyography), breathing airflow and pulse.
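
The slicing step can be illustrated with a minimal Python sketch. It is not the study's preprocessing code: the function name `slice_recording`, the channel names, and the assumption that every channel shares one sampling rate are all hypothetical simplifications chosen for illustration.

```python
from typing import Dict, List


def slice_recording(
    channels: Dict[str, List[float]],
    sample_rate_hz: int,
    window_s: int = 5,
) -> List[Dict[str, List[float]]]:
    """Split a multichannel recording into fixed-length windows.

    Assumption: every channel uses the same sampling rate. Real sleep
    recordings often mix rates, so this is a simplification.
    Trailing samples that do not fill a whole window are dropped.
    """
    samples_per_window = sample_rate_hz * window_s
    n_samples = min(len(signal) for signal in channels.values())
    n_windows = n_samples // samples_per_window

    windows = []
    for i in range(n_windows):
        start = i * samples_per_window
        end = start + samples_per_window
        # Each window keeps all channels side by side for the same time span.
        windows.append({name: signal[start:end] for name, signal in channels.items()})
    return windows


# Toy data: 12 seconds of two made-up channels sampled at 4 Hz.
toy = {"eeg": [0.0] * 48, "ecg": [1.0] * 48}
toy_windows = slice_recording(toy, sample_rate_hz=4)
print(len(toy_windows))  # 2
```

At this window length, an eight-hour recording (28,800 seconds) would yield 5,760 slices per night.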

“SleepFM is essentially learning the language of sleep,” Zou said.

The researchers also developed a new training strategy, called leave-one-out contrastive learning. In simple terms, they would hide one type of signal — for example, the heart rhythm — and challenge the model to reconstruct it using the remaining signals. That forced the AI to understand how different parts of the body’s nighttime physiology relate to each other.
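
One plausible way to express that idea as a training objective is sketched below in Python. This is an illustrative guess at the general shape of a leave-one-out contrastive loss, not the authors' implementation: the cosine similarity, the simple averaging of the remaining signals, the `temperature` value, and all function names are assumptions. For each held-out signal, the loss is low when its embedding for a given slice sits closer to the combined embedding of the other signals from the same slice than to those from other slices.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def leave_one_out_loss(
    embeddings: Dict[str, List[Vector]], temperature: float = 0.1
) -> float:
    """Average contrastive loss over every held-out signal type.

    embeddings[name][i] is the (hypothetical) embedding of slice i
    for one signal type, such as brain or heart activity.
    """
    names = list(embeddings)
    batch = len(embeddings[names[0]])
    total = 0.0
    for held_out in names:
        others = [name for name in names if name != held_out]
        # Summarize the remaining signals for each slice by averaging them.
        context = [
            mean_vector([embeddings[name][i] for name in others])
            for i in range(batch)
        ]
        for i in range(batch):
            logits = [
                cosine(embeddings[held_out][i], context[j]) / temperature
                for j in range(batch)
            ]
            # Cross-entropy with the matching slice (j == i) as the target,
            # computed with a numerically stable log-sum-exp.
            peak = max(logits)
            log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
            total += log_denominator - logits[i]
    return total / (len(names) * batch)


# Toy check: signals that agree slice by slice give a lower loss
# than the same data with one signal's slices swapped.
aligned = {
    "eeg": [[1.0, 0.0], [0.0, 1.0]],
    "ecg": [[1.0, 0.0], [0.0, 1.0]],
    "resp": [[1.0, 0.0], [0.0, 1.0]],
}
shuffled = dict(aligned, eeg=[[0.0, 1.0], [1.0, 0.0]])
print(leave_one_out_loss(aligned) < leave_one_out_loss(shuffled))  # True
```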

“One of the technical advances that we made in this work is to figure out how to harmonize all these different data modalities so they can come together to learn the same language,” Zou said.

After this broad training phase, the team fine-tuned SleepFM for specific tasks.

First, they asked it to do what existing sleep AI tools already attempt: classify sleep stages and assess the severity of sleep apnea. On those benchmarks, SleepFM matched or outperformed current state-of-the-art models.

Then the researchers pushed further, testing whether the patterns hidden in one night of sleep could predict future disease.

Because the Stanford Sleep Medicine Center has decades of electronic health records linked to its sleep studies, the team could see which patients later developed various conditions. That allowed them to ask: Did anything in the sleep data foreshadow those outcomes?

SleepFM analyzed more than 1,000 categories of disease and found 130 that it could predict with reasonable accuracy based on the original sleep recordings. The model’s forecasts were especially strong for cancers, pregnancy complications, circulatory diseases and mental health disorders.

To measure performance, the researchers used a standard metric called the concordance index, or C-index, which captures how well a model can rank who is likely to get sick sooner.

“For all possible pairs of individuals, the model gives a ranking of who’s more likely to experience an event — a heart attack, for instance — earlier. A C-index of 0.8 means that 80% of the time, the model’s prediction is concordant with what actually happened,” Zou added.
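
The pairwise ranking Zou describes can be made concrete with a short Python sketch of a standard concordance calculation (often called Harrell's C-index). The function name and the toy numbers are hypothetical, and the study's own evaluation code may differ in details such as how ties are handled.

```python
from typing import List


def concordance_index(
    times: List[float], events: List[bool], risks: List[float]
) -> float:
    """Fraction of comparable pairs that the risk scores rank correctly.

    times[i]  - follow-up time until the event or until observation ended
    events[i] - True if the event was actually observed for person i
    risks[i]  - model score, where higher means the event is expected sooner

    A pair is comparable when the person with the shorter time had an
    observed event. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # an unobserved event cannot anchor a comparison
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: five people who all had the event, at years 2 through 10.
times = [2, 4, 6, 8, 10]
events = [True, True, True, True, True]
risks = [0.9, 0.7, 0.8, 0.1, 0.3]  # two pairs are ranked the wrong way round
print(concordance_index(times, events, risks))  # 0.8
```

Here 8 of the 10 comparable pairs are ordered correctly, which gives the 0.8 used in the quote. A score of 0.5 would correspond to random ranking and 1.0 to a perfect one.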

SleepFM reached or exceeded a C-index of 0.8 for several serious conditions, including Parkinson’s disease, dementia, hypertensive heart disease, heart attack, prostate cancer, breast cancer and overall mortality.

“We were pleasantly surprised that for a pretty diverse set of conditions, the model is able to make informative predictions,” Zou said.

Models with somewhat lower C-index scores are already used in oncology and other specialties to guide treatment decisions, suggesting that SleepFM’s performance could be clinically meaningful if validated and deployed carefully.

The study also offers a glimpse into how AI might change preventive medicine. If a routine sleep study could flag elevated risk for certain cancers or cardiovascular problems years in advance, doctors might recommend more frequent screening, lifestyle changes or closer monitoring long before disease takes hold.

For now, SleepFM is a research tool, not something patients will encounter at their next sleep clinic visit. The team is working to refine its predictions and make the model more interpretable, since it does not spell out the reasoning behind a given forecast.

“It doesn’t explain that to us in English,” Zou added. “But we have developed different interpretation techniques to figure out what the model is looking at when it’s making a specific disease prediction.”

Those techniques suggest that while certain signals matter more for certain diseases — heart rhythms for heart disease, brain waves for mental disorders — the model performs best when it can compare multiple channels at once.

“The most information we got for predicting disease was by contrasting the different channels,” Mignot said.

In other words, trouble may show up when different parts of the body fall out of sync during sleep — for example, when brain activity suggests deep rest but the heart looks unusually active.

Looking ahead, the researchers hope to improve SleepFM by incorporating data from consumer wearables, which are far less detailed than polysomnography but much easier to collect at scale. They also see potential for adapting the model to different populations and health systems, and for exploring how changes in a person’s sleep patterns over time relate to disease risk.

The project brought together scientists from Stanford, the Technical University of Denmark, Copenhagen University Hospital – Rigshospitalet, BioSerenity, the University of Copenhagen and Harvard Medical School.

For students and early-career researchers, SleepFM is a case study in how long-standing clinical practices can be reimagined with modern AI. A test designed to diagnose sleep disorders may also hold clues to cancer, heart disease and brain health — if we learn how to read the signals.

Source: Stanford Medicine