AI cancer tools risk “shortcut learning” rather than detecting true biology

3 Mar 2026
New research warns that popular deep learning systems trained for cancer pathology may be relying on hidden shortcuts rather than genuine biological signals.

Artificial intelligence tools are increasingly being developed to predict cancer biology directly from microscope images, promising faster diagnoses and cheaper testing.

But new research from the University of Warwick, published in Nature Biomedical Engineering, suggests that many of these systems may be using visual shortcuts rather than true biology — raising concerns that some AI pathology tools are currently too unreliable for real-world patient care.

To reach this conclusion, the researchers analysed more than 8,000 patient samples across four major cancer types — breast, colorectal, lung and endometrial — and compared the performance of leading machine learning approaches.

While the models often achieved high headline accuracy, the team found this frequently came from statistical “shortcuts.”

For example, instead of detecting mutations in the cancer-associated BRAF gene, a model might learn that BRAF mutations often occur alongside another clinical feature such as microsatellite instability (MSI).

The system then learns to use this combination of cues to predict BRAF status rather than learning the causal BRAF signal itself. As a result, its predictions are accurate only when these biomarkers co-occur and become unreliable when they do not.

When the performance of the AI models was assessed within stratified patient subgroups, such as only high-grade breast cancers or only MSI-positive tumours, accuracy fell substantially, revealing that the models depended on shortcut signals that disappear once confounding factors are controlled.

For certain prediction tasks, the performance advantage of deep learning over human-derived clinical information was modest.

AI systems achieved accuracy scores of just over 80% when predicting biomarkers, compared with around 75% using tumour grade alone — a measure already assessed by pathologists.

Machine learning methods can still prove valuable for research, for screening drug development candidates, and for clinical triage, screening or supplementary decision support.

However, the researchers argue that future AI tools must move beyond correlation-based learning and adopt approaches that explicitly model biological relationships and causal structure.

They also call for stronger evaluation standards, including subgroup testing and comparison against simple clinical baselines, before tools are considered for deployment in routine care.
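The two checks named above, subgroup testing and comparison against a simple clinical baseline, can be illustrated with a toy simulation. The sketch below is not the researchers' code or data: the cohort is synthetic, the mutation rates are made up, the "model" is a hypothetical stand-in whose score depends only on MSI status, and AUC is an assumed choice of metric.

```python
import random

def auc(scores, labels):
    """Chance that a random positive case outscores a random negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)  # fixed seed so the run is repeatable
cohort = []
for _ in range(3000):
    msi = int(rng.random() < 0.3)
    # Made-up rates: BRAF mutation is much more common in MSI-positive tumours.
    braf = int(rng.random() < (0.6 if msi else 0.05))
    # Shortcut "model": its score tracks MSI plus noise and carries no BRAF-specific signal.
    score = msi + rng.gauss(0.0, 0.3)
    cohort.append((msi, braf, score))

def evaluate(rows):
    """AUC of the shortcut model's score for BRAF status on the given rows."""
    return auc([r[2] for r in rows], [r[1] for r in rows])

# Headline performance over the whole synthetic cohort.
overall_auc = evaluate(cohort)

# Subgroup testing: hold the confounder fixed and evaluate again.
auc_msi_pos = evaluate([r for r in cohort if r[0] == 1])
auc_msi_neg = evaluate([r for r in cohort if r[0] == 0])

# Simple clinical baseline: predict BRAF from MSI status alone.
baseline_auc = auc([r[0] for r in cohort], [r[1] for r in cohort])

print(f"overall AUC:        {overall_auc:.2f}")
print(f"MSI-positive only:  {auc_msi_pos:.2f}")
print(f"MSI-negative only:  {auc_msi_neg:.2f}")
print(f"MSI-only baseline:  {baseline_auc:.2f}")
```

In this toy setting the headline score looks strong, yet within each MSI subgroup the model should do little better than chance, and the one-variable baseline should roughly match it. That is the pattern subgroup testing and baseline comparison are meant to expose.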

“While progress often requires imperfect first steps, we should learn from the past and avoid oversimplification or overreach through inappropriate concepts. Complexity and variability are central challenges — but they are also exactly what these novel technologies must learn to embrace.”

Source: University of Warwick