Phesi, a global provider of patient-centric data science solutions for clinical development, has conducted a global analysis of more than 600,000 clinical trial protocols ahead of ASCO 2026. The analysis finds that fewer than one in three (29.3 %) protocols are linked to publicly documented patient data and outcomes. This trend is consistent across oncology trials, where just 30.9 % of 116,746 protocols are linked to usable patient data. With historical protocols often used as the basis for future trial design, the findings highlight a significant risk of flawed decision-making in clinical development.
As part of its analysis, Phesi looked more closely at breast cancer, the world’s most studied disease over the past five years. Despite the scale and maturity of research in breast cancer, just 31.2 % of 15,977 protocols are linked to trials with usable patient data. The findings show that even in the most data-rich disease area, high volumes of research do not automatically create the reliable, outcome-driven evidence base needed for future trial design or AI models.
“This is a systemic issue in drug development. It is standard practice to design protocols based on existing, similar protocols, but when those templates are disconnected from the target patient population, they contribute to amendments and recruitment challenges. In the past, protocol writers could be selective about the protocols they used and apply human judgment to the connection between design and the target patient population. AI can ingest far more historical templates, but without the right logic or judgment it may fail to make that connection. A clinical protocol is effectively a business plan for an investment of tens or hundreds of millions of dollars, so AI must be guided by the right data foundations. In essence, flaws are being scaled, not solved.”
Dr. Gen Li, President and Founder, Phesi
The reasons why protocols are not linked to patient data are complex. Some trials fail to recruit or complete because design issues restrict the ability to enroll, while others may enroll patients and collect data that never sees the light of day, despite legal reporting requirements. This creates a gap between protocol intent and what happens in patients, making it harder to understand which designs work, which fail and why. Without this visibility, sponsors risk reusing protocol designs that have already contributed to recruitment challenges, amendments or failed studies. In severe cases, failure of a development program can lead to sponsor financial distress or even bankruptcy.
“The gap between what protocols are designed to do and what actually happens in patients is the missing link in both current clinical development processes and emerging AI approaches,” commented Dr. Li. “Datasets must account for the full patient population, not just narrow subsets from late-phase trials or large protocol datasets disconnected from patient treatment outcomes. Similarly, patient population data extracted from electronic health records may or may not align with the patient population targeted by a protocol. Only curated and contextualized data enables sponsors to identify meaningful oncology subpopulations and surface risk factors. There is huge opportunity for AI to optimize clinical development, but only when the platforms being used as the basis for AI can identify protocols with patients and outcomes reported. At Phesi, we are focused on connecting real-world and clinical patient data with trial execution to support precise study design and decision-making.”
Phesi's Trial Accelerator is built on this foundation, drawing on contextualized data from 375 million patients in 232 countries and territories, produced from 719,183 clinical research and clinical development projects guided by protocols and other study plans. The platform draws from 22 registries around the world, with additional sources including clinical trials, observational studies and real-world datasets.
The full report is available here.
Phesi will be attending ASCO 2026 and can be found at booth #31094.