Arc Institute Launches Virtual Cell Challenge to Accelerate AI Model Development


Yusuf Roohani, PhD, machine learning group lead at the Arc Institute, is among a team of researchers training artificial intelligence (AI) models with transcriptome data to predict how cell gene expression patterns change with different cell states. These so-called virtual cells could help researchers discover new drugs capable of shifting cells from “diseased” to “healthy” with fewer off-target effects to boost clinical success rates.

However, building a virtual cell is not an easy feat.

“When you look at cells, they are living dynamic systems,” said Roohani in an interview with GEN Edge. “Cells are constantly in flux, they’re messy, and they’re dependent on the experiment.”

Virtual cell models must account for biological complexity, such as the cell type, genetic background, and cell context. In addition, many existing single-cell datasets are impacted by substantial technical noise, including limited reproducibility of perturbation effects across independent experiments, which diminishes model performance.

Without standardized benchmarks and purpose-built datasets, the field has struggled to evaluate whether virtual cell models are capturing generalizable biological insights and not dataset-specific patterns.

To help benchmark virtual cell models and accelerate their development, the Arc Institute has announced the inaugural “Virtual Cell Challenge,” a public competition sponsored by Nvidia, 10x Genomics, and Ultima Genomics. A grand prize worth $100,000 will go to the machine learning model that best predicts how cells respond to genetic perturbations. The challenge is described in a new commentary published in Cell with Roohani as lead author.

The initiative follows in the footsteps of the Critical Assessment of protein Structure Prediction (CASP) competition, the biennial experiment that assesses the latest state-of-the-art models in structural biology. Patrick Hsu, PhD, co-founder and core investigator at Arc, emphasized that CASP competitions have transformed protein structure prediction over 25 years and ultimately enabled breakthroughs such as the Nobel Prize-winning algorithm AlphaFold.

“We believe Arc can use the same approach to accelerate progress toward comprehensive virtual cells that could fundamentally change how we study biology and identify targets to better treat complex diseases,” said Hsu in a public release.

Emma Lundberg, PhD, associate professor at Stanford University and the co-director of the Human Protein Atlas, which is based at the KTH Royal Institute of Technology in Stockholm, concurs that establishing benchmarks has been a key challenge for evaluating and comparing virtual cell models.

“I expect that [Arc’s] challenge will help to align the community and accelerate the work toward performant and useful virtual cell models. Hopefully, it’s the first of many standardized challenges in this space,” she told GEN Edge.

Theofanis Karaletsos, senior director of AI at the Chan Zuckerberg Initiative (CZI), is an active developer of virtual cell models who has advanced CZI’s recent releases, such as scGenePT for single-cell perturbations and TranscriptFormer for cross-species predictions.

“At CZI, we’re focused on building cutting-edge models and providing standardized evaluation frameworks to deepen the scientific community’s understanding of cells,” Karaletsos told GEN Edge. “Community benchmarks are important, and we believe open competitions like Arc’s are a powerful mechanism to accelerate innovation and collective progress.”

A Palo Alto-based non-profit research organization, the Arc Institute was founded in 2021 by Hsu and Silvana Konermann, PhD, assistant professor of biochemistry at Stanford University and Arc’s current executive director. Since then, the institute has become known for making big bets on data-driven AI. In collaboration with Nvidia earlier this year, Arc released Evo 2, which it described at the time as the largest publicly available AI model for biology.

New context

As a key challenge for AI models is making predictions outside of the training data, the Arc competition will evaluate how well competing virtual cells can predict changes in gene activity when generalizing to a new cell context.

For the inaugural competition, Arc has generated a new single-cell transcriptomics dataset of 300,000 H1 human embryonic stem cells (H1 hESCs) with 300 genetic perturbations, which will be deployed throughout the competition in segments for fine-tuning, validation, and testing.

Models will be evaluated on the following three metrics: 1) performance in predicting differentially expressed genes; 2) performance discriminating between different perturbation effects; and 3) general error in terms of deviation from expression counts.
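To make the three metrics concrete, the following is a minimal Python sketch of how scores of this kind could be computed. It is an illustration only and not the challenge's official scoring code: the function names, the overlap definition, the rank-based discrimination score, and the use of absolute differences are all assumptions made for this example.

```python
from statistics import mean

def de_overlap(true_de, pred_de):
    """Metric 1 (assumed form): fraction of truly differentially
    expressed genes that the model also flags."""
    true_de, pred_de = set(true_de), set(pred_de)
    if not true_de:
        return 1.0
    return len(true_de & pred_de) / len(true_de)

def discrimination_score(true_profiles, pred_profiles):
    """Metric 2 (assumed form): for each perturbation, check how many
    other perturbations' true profiles sit closer to the prediction
    than the matching true profile. 1.0 means every prediction is
    closest to its own perturbation."""
    names = list(true_profiles)
    scores = []
    for name in names:
        pred = pred_profiles[name]
        dists = {
            other: sum(abs(a - b) for a, b in zip(pred, true_profiles[other]))
            for other in names
        }
        rank = sum(1 for other in names if dists[other] < dists[name])
        scores.append(1 - rank / max(len(names) - 1, 1))
    return mean(scores)

def mean_abs_error(true_expr, pred_expr):
    """Metric 3 (assumed form): average absolute deviation between
    predicted and observed expression values."""
    return mean(abs(t - p) for t, p in zip(true_expr, pred_expr))

# Toy data: three hypothetical perturbations measured over three genes.
true_profiles = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}
pred_profiles = {"A": [0.9, 0.1, 0.0], "B": [0.1, 0.8, 0.0], "C": [0.0, 0.9, 0.2]}

overlap = de_overlap({"g1", "g2", "g3", "g4"}, {"g2", "g3", "g5"})
discrimination = discrimination_score(true_profiles, pred_profiles)
error = mean_abs_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

In the toy data, the prediction for perturbation C lands closer to the true profile of B than to its own, so it lowers the discrimination score even though the other two predictions are ranked correctly.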

The interim performance of competitor models will be shared on a live leaderboard during the middle phase of the competition. The three teams with the top models will receive prizes valued at $100,000, $50,000, and $25,000, combining cash awards and NVIDIA DGX Cloud credits.

Registration for the competition is now open. Individual contributors as well as teams from academic institutions, biotechnology companies, and independent research organizations are eligible to participate. Final rankings will be determined solely by model performance on the final test set, which will be released in late October, one week prior to the final submission deadline. Winners will be announced in December.

Current STATE

As a baseline, Virtual Cell Challenge competitors will initially go head-to-head with Arc’s first virtual cell model, STATE, which is designed to predict how various stem cells, cancer cells, and immune cells respond to drugs, cytokines, or genetic perturbations. STATE was released earlier this week for non-commercial use and is described in a preprint posted on Arc’s website that has not yet been peer reviewed.

According to the authors, STATE improved discrimination of perturbation effects on multiple large datasets by over 50% and identified differentially expressed genes across genetic, signaling, and chemical perturbations with more than twice the accuracy of existing models.

To promote flexibility and scalability, STATE is composed of two interlocking modules, known as the State Transition model (ST) and State Embedding model (SE).

ST learns perturbation effects using data from over 100 million perturbed cells across 70 contexts. In contrast to existing models, which focus on making predictions for a single cell at a time, ST leverages a distinct bi-directional transformer architecture to make predictions for entire cell collections. The approach offers an advance by allowing the flexible capture of biological and technical heterogeneity without relying on explicit assumptions about the distribution.
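The idea of predicting over a whole collection of cells, with each cell attending to every other cell, can be sketched with a toy unmasked self-attention step. This is not the STATE architecture: the single head, the identity projections, and the function name set_self_attention are simplifying assumptions chosen only to show what bidirectional attention over a set of cells means.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def set_self_attention(cells):
    """Unmasked (bidirectional) single-head self-attention with
    identity projections. `cells` is a list of equal-length
    expression vectors, one per cell. Every output vector is a
    weighted average over ALL cells in the set."""
    dim = len(cells[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in cells:
        # Similarity of this cell to every cell in the set (no causal mask).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in cells]
        weights = softmax(scores)
        outputs.append(
            [sum(w * cell[j] for w, cell in zip(weights, cells)) for j in range(dim)]
        )
    return outputs

# Three hypothetical cells, two genes each.
cells = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = set_self_attention(cells)

# Reordering the cells only reorders the outputs, because the
# operation treats its input as a set rather than a sequence.
mixed_reversed = set_self_attention(cells[::-1])
```

Because no positional information or mask is involved, the order of the cells carries no meaning, which is the property that lets a model of this kind treat a population of cells as a set.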

SE is trained on observational single-cell data from 167 million human cells to learn gene expression variation between cells across diverse datasets. The module provides representations that are optimized for detecting biological perturbations and robust to technical noise to allow STATE to be effectively trained with multiple large datasets.

STATE is a transformer-based model for predicting perturbation effects across sets of cells. [Arc Institute]

Data-bound progress

Virtual Cell Challenge competitors are invited to train models on gene expression data from public databases, including the more than half a billion cells in the Arc Virtual Cell Atlas, which is composed of large single-cell datasets, including scBaseCount and Tahoe-100M.

Fabian Theis, PhD, director of the Institute for Computational Biology at Helmholtz Munich, is a renowned researcher working to predict the effects of genetic and chemical perturbations at the cellular level. He says improving data scale and quality has been key to pushing the field forward.

“I am excited about the upcoming perturbation prediction challenge by Arc,” Theis told GEN Edge. “Data scale has only recently been expanding sufficiently to allow complex generative AI models to outperform simpler linear models. It will be exciting to see true out-of-distribution behavior of various model types evaluated on new data.”

Theis’s lab group is known as the developer of CellFlow, a framework based on flow matching, a generative modeling approach, that can simulate single cell phenotypes induced by complex perturbations. Additionally, Theis is a scientific advisor to Open Problems, a scientific group that has hosted related challenges for benchmarking various single-cell analysis methods.

Additional datasets that are fair game for Virtual Cell Challenge model training include X-Atlas/Orion, the largest publicly available Perturb-seq dataset, released last week by AI drug discovery unicorn Xaira Therapeutics. The dataset offers the advantage of measuring dose-dependent genetic effects for therapeutic applications, such as defining the precise percent inhibition at which a drug target produces a desired effect.

Ci Chu, PhD, vice president of early discovery at Xaira, agrees that CASP has set a nice precedent for benchmarking in protein structure prediction.

“It’s exciting to see the Arc team apply the same spirit to the virtual cell community,” Chu told GEN Edge. “The field’s progress is ultimately data-bound. The more high-quality, public data the community has to build on, the better—which is exactly why we released X-Atlas/Orion as well.”

Xaira is currently building its own virtual cell models with AI expert Bo Wang, PhD, SVP and head of biomedical AI, who joined the team in April. Hailing from the University of Toronto, Wang is known as the inventor of scGPT, a foundation model for single-cell multi-omics with downstream capabilities including cell type annotation, perturbation response prediction, and gene network inference.

As researchers push forward the next generation of AI models to make a mark on the Virtual Cell Challenge leaderboard, the field will watch whether new therapeutic advances will follow suit. Let the challenge begin.