Pharma’s Trial Problem: Outdated Systems, Broken Data, and the Coming AI Reset

Erik Terjesen
Managing Director
Silicon Foundry, a Kearney Company

Clinical development has become the most resource-intensive stage of drug innovation. Across the industry, clinical trials consume 60–70% of total R&D spending, a proportion that continues to rise as trials grow more complex, more data-heavy, and more operationally demanding. The irony is that while science has advanced dramatically, the underlying model for running trials still reflects assumptions from a pre-digital era. The result is an ecosystem in which timelines stretch, costs multiply, and meaningful efficiency gains remain elusive.

AI has reached a level of maturity capable of reshaping this landscape, but its potential remains constrained by a fundamental issue the industry has been slow to confront. The data used to power these systems was never designed with AI in mind. In fact, the true crisis in clinical development today is structural and deeply rooted in how trial data is organized, contextualized, and interpreted.

Why trial models are failing

Clinical trials were built for physical sites, paper workflows, and slow-moving systems. Modern trials look nothing like that. They are distributed, data-heavy, biomarker-driven, and increasingly adaptive, yet they still run on infrastructure designed for a simpler era.

For years, clinical operations have been organized around sites and checklists rather than continuous insight. Data moves in bursts, workflows remain fragmented, and systems rarely talk to one another. Precision medicine expanded what trials could ask of data, but the way trials actually operate has barely evolved.

The problem isn’t only speed or scale. It’s also the quiet erosion of efficiency in places trial plans rarely account for. Across the industry, leaders describe a growing layer of “invisible waste”: repeated handoffs, duplicative manual work, incompatible data structures, and everyday operational friction that steadily stretches timelines and drives up costs, even though it seldom appears in formal project plans.

AI changes the equation, but only if trial data can support it.

Why AI stumbles in pharma

There is no shortage of AI talent, tools, or ambition in the life sciences sector. What is scarce is data that AI can meaningfully learn from. Most early AI-for-clinical-trials initiatives failed not because the models were immature, but because the data they were fed was not curated with clinical intent.

Two challenges define this crisis:

1. General-purpose models cannot interpret clinical nuance.

Models trained on large public corpora can identify patterns, but they lack clinical judgment. If the data is unstructured, inconsistently labeled, or missing contextual metadata, the model can draw the wrong conclusions with complete confidence. The well-known “ruler problem,” in which an AI system learned to associate malignant skin lesions with the presence of a ruler beside the lesion, illustrates how easily models latch onto irrelevant signals.

2. Pharma’s internal data is both rich and unusable.

Organizations hold decades of trial data, but these assets are rarely AI-ready. Different study teams, CROs, and geographies used different standards. Biomarker and imaging data are often stored in systems that cannot communicate with EDC or safety platforms. And clinical notes, PDFs, and unstructured documents require interpretation that models cannot perform without curated training sets.

AI amplifies the quality of the data it is given. If the input is clinically inconsistent, overgeneralized, or disconnected from the trial context, the outputs will be clinically meaningless.
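
As a rough illustration of what "AI-ready" can mean in practice, the sketch below maps records from different sources onto one shared schema and flags records that lack the context a model would need. It is a minimal sketch under stated assumptions: the field names, the alias table, and the list of required fields are all hypothetical and are not drawn from any real EDC, CRO, or safety system.

```python
# Hypothetical mapping from source-specific field names to one shared schema.
# These names are illustrative only, not taken from any real trial system.
FIELD_ALIASES = {
    "subj_id": "subject_id",
    "patient": "subject_id",
    "subject_id": "subject_id",
    "visit_dt": "visit_date",
    "visit_date": "visit_date",
    "hgb": "hemoglobin_g_dl",
    "hemoglobin": "hemoglobin_g_dl",
}

# Fields assumed (for this sketch) to be required before a record is usable.
REQUIRED_FIELDS = ("subject_id", "visit_date", "hemoglobin_g_dl")


def harmonize(record: dict) -> tuple[dict, list[str]]:
    """Rename known fields to the shared schema and list missing required ones."""
    clean = {}
    for name, value in record.items():
        canonical = FIELD_ALIASES.get(name.lower())
        # Unknown fields and empty values are dropped rather than guessed at.
        if canonical is not None and value not in (None, ""):
            clean[canonical] = value
    missing = [f for f in REQUIRED_FIELDS if f not in clean]
    return clean, missing


# Two made-up records, as two different study teams might have stored them.
records = [
    {"SUBJ_ID": "A-001", "visit_dt": "2021-03-04", "HGB": 13.2},
    {"patient": "B-117", "hemoglobin": 11.8},
]

for rec in records:
    clean, missing = harmonize(rec)
    status = "ready" if not missing else f"needs curation, missing {missing}"
    print(clean.get("subject_id"), status)
```

The design choice worth noting is that incomplete records are surfaced for curation rather than silently filled in, so gaps in trial context stay visible before any model sees the data.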

Recognizing this, many pharma companies are now investing heavily in curated internal datasets, governance frameworks, and senior AI leadership, often in the form of newly created chief AI officer roles. These leaders are tasked not just with deploying tools, but with rebuilding the data infrastructure from which future AI insights will emerge.

The new AI toolkit for clinical trials

Once the data foundation is strong, AI becomes a force multiplier across the entire trial lifecycle. Several categories show particularly high near-term impact potential.

Clinical-grade language models: Purpose-built models that ingest curated internal datasets can help draft protocols, refine eligibility criteria, flag operational risks, and interpret historical trial performance. Unlike general-purpose systems, these models are tuned to reason the way experienced clinical scientists do.

Multimodal AI for patient stratification and endpoint optimization: Integrating imaging, labs, digital biomarkers, and historical trial outcomes enables more precise cohort selection and improves the likelihood of detecting true therapeutic effect. These tools help convert today’s complex data streams into actionable insights.

Synthetic and hybrid control arms: While still emerging, these approaches reduce dependence on large traditional control cohorts by incorporating real-world evidence and model-generated comparators when appropriate. The result can be faster recruitment and more efficient statistical design.

AI agents for operations: Operational agents can triage site queries, assist with eligibility adjudication, coordinate scheduling, and draft routine documentation. They are particularly helpful in reducing the administrative burden that slows trial execution.

The most underestimated category, and the one with the most long-term potential, is clinically driven AI, where the model is trained to interpret clinical data the way a PhD-level researcher or a clinician would. This approach addresses the core issue of context, which is essential for decision-making in regulated environments.

From site-centric to data-centric trials

Trials are gradually evolving away from rigid site-based infrastructure and toward data-centric execution. AI accelerates this shift by enabling continuous monitoring, adaptive decision-making, and greater representation across diverse populations. The next phase of this transition requires progress in several areas:

  • Reliable digital biomarkers collected via wearables and sensors that feed directly into the trial data ecosystem.
  • Real-world evidence integration that allows trial designs to incorporate external data while maintaining regulatory rigor.
  • Improved cohort diversity, supported by AI-driven recruitment models that identify and engage underrepresented populations.
  • Always-on trial oversight, where adaptive protocols adjust based on real-time data rather than periodic interim reviews.

As these elements mature, trials will resemble dynamic learning systems rather than static sequences of predefined events.

Pharma cannot do this alone

The clinical-trial innovation ecosystem is now highly fragmented. Myriad startups, many founded within the last five years, are attempting to solve different slices of the trial process. Some focus on recruitment; others on protocol simulation, operational automation, predictive enrollment, or digital biomarker analysis.

This fragmentation creates noise but also opportunity. The organizations that succeed will be those that adopt a hybrid strategy, in which internal data expertise is paired with carefully selected external partners. Evaluating early-stage companies requires disciplined technical assessment and an understanding of which partners can meet enterprise requirements in a regulated environment.

Pharma organizations also face a structural talent challenge. The best AI engineers often gravitate toward startups rather than large enterprises. This dynamic reinforces the need for partnership models that combine internal governance with external innovation rather than relying exclusively on one or the other.

What AI can (and cannot) fix

While AI can dramatically shorten timelines and improve decision-making, it is not a cure-all. It will not rescue a flawed trial design, replace human oversight, or eliminate the need for regulatory rigor. What it can do is accelerate the work around those elements, optimizing how protocols are developed, how patients are selected, how data is interpreted, and how milestones are achieved. The organizations that reap the greatest benefit will be those with disciplined data stewardship and a willingness to rethink long-held operational assumptions.

Erik Terjesen is the managing director at Silicon Foundry, a Kearney Company.