Key Genetic Variants That Influence Disease Risk, Human Health Traits Identified

A study by researchers at The Jackson Laboratory (JAX), the Broad Institute, and Yale University has identified how specific genetic changes function in cells to influence disease risk and other human health traits. By probing regions of DNA previously linked to disease, the scientists created high-resolution maps of DNA variant activity, helping pinpoint the exact changes that shape blood pressure, cholesterol levels, blood sugar and other complex human traits.

The study takes on a long-standing challenge in human genetics. Scientists have known for years that certain regions of the genome—often spanning tens of thousands to millions of DNA letters—are associated with diseases. But because these regions usually contain many variants that could potentially drive those associations, performing the necessary experiments to pinpoint which specific DNA changes truly matter has been difficult and time-consuming.

The solution was scale. Using a massively parallel reporter assay (MPRA), a high-throughput approach that simultaneously evaluates the regulatory activity of thousands of DNA sequences, the team tested more than 220,000 previously identified DNA changes in five different cell types. By doing so, they resolved about 20% of these regions across the genome, revealing new insights into what these variants do, which in turn could help improve risk prediction and guide the development of new therapies.

Geneticist Ryan Tewhey, PhD, an associate professor who led the team at JAX, explained that making these connections was previously like searching for a single typo on one page of a massive book. The researchers’ new experimental approach is more akin to speed reading: scanning thousands of pages at once and flagging the exact letters that change meaning, dramatically accelerating genetic discovery.

“For nearly two decades, genetic studies have identified where in the genome we need to look for disease risk, but not which specific DNA changes are responsible,” Tewhey said. “Our study helps close this gap by working at the scale needed to confidently pinpoint the specific DNA changes that matter across thousands of regions all at once, rather than one by one.”

“What excites me is that this is a bridge from association to biology,” added Layla Siraj, MD, PhD, first author of the team’s published report in Nature (“Functional dissection of complex trait variants at single-nucleotide resolution”). Siraj spearheaded the study while in the Lander Lab at the Broad Institute, and is now in her residency in obstetrics and gynecology at Columbia University/New York Presbyterian. “By uncovering the patterns underlying how single-letter changes affect gene regulation, we can start mechanistically connecting genetic risk to the pathways therapies could target,” Siraj added. In their paper the team concluded, “Overall, our study provides a systematic functional characterization of likely causal common variants that underlie complex and molecular human traits, enabling new insights into the regulatory grammar underlying disease risk.”

In addition to Tewhey and Siraj, the newly published study in Nature was co-led by Jacob Ulirsch, PhD, currently a group leader at Illumina. Key authors also include Steven Reilly, PhD, assistant professor at Yale School of Medicine; and Hilary Finucane, PhD, associate member at the Broad Institute and assistant professor at Harvard Medical School and Massachusetts General Hospital. “Genome-wide association studies (GWASs) have successfully linked tens of thousands of loci to complex human traits and diseases,” the authors wrote. However, “Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics.”

Most DNA changes linked to common diseases such as heart disease and type 2 diabetes occur not within genes themselves (which constitute only about two percent of the genome) but in the vast stretches of non-coding DNA, where regulatory elements control when, where and how strongly our genes are expressed. “Most trait-associated variants individually have small effect sizes and are located in non-coding cis-regulatory elements (CREs),” the team continued. Genetic studies conducted over the last two decades have identified millions of such non-coding disease-related variants throughout the genome. The challenge has been identifying which of the many single-letter changes in these regulatory DNA regions affect gene activity, fine-tuning protein production and in turn shaping disease risk.

To meet this challenge, the researchers used a massively parallel reporter assay, which allowed them to test the effects of 221,412 single-letter DNA variants at the same time across different cell types, including brain, liver and blood cells. “Designed to measure CREs from both promoters and distal elements, MPRAs effectively detect responses to a diverse range of transcription factors (TFs),” they explained. Each stretch of DNA was paired with a molecular tag, or reporter, that they could directly measure to see whether a variant increased, decreased, or had no effect on gene activity—an important step in understanding how regulatory DNA changes may affect health. The results revealed over 13,000 single-letter variants that influence how strongly a gene is expressed. While most act independently, the team found that about 11% behaved differently than expected when combined with a nearby variant. This surprising result suggests some genetic risk of disease depends on specific combinations of variants whose whole is greater than the sum of its parts.
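
To make the readout described above more concrete, here is a minimal Python sketch of how one variant's reporter measurement might be scored. It is an illustration under stated assumptions, not the study's analysis pipeline: the function names, the pseudocount, and the 0.5 threshold are all hypothetical, and a real analysis would rely on replicates and formal statistical testing rather than a fixed cutoff.

```python
import math


def log2_activity(rna_count, dna_count, pseudocount=1.0):
    """Reporter activity as log2(RNA / DNA).

    RNA counts stand in for how often the reporter tag was transcribed;
    DNA counts stand in for how much of the construct was delivered.
    The pseudocount (an assumption) avoids division by zero.
    """
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_skew(ref_counts, alt_counts):
    """Difference in log2 activity between the two alleles (alt minus ref).

    Each argument is a (rna_count, dna_count) tuple for one allele.
    """
    return log2_activity(*alt_counts) - log2_activity(*ref_counts)


def classify_effect(skew, threshold=0.5):
    """Label a variant by the direction of its effect.

    The threshold is a hypothetical cutoff chosen only for illustration.
    """
    if skew > threshold:
        return "increases"
    if skew < -threshold:
        return "decreases"
    return "no effect"


# Made-up counts: the alternate allele yields about four times more
# reporter RNA per unit of DNA than the reference allele.
skew = allelic_skew(ref_counts=(99, 99), alt_counts=(399, 99))
print(skew, classify_effect(skew))
```

Working in log space keeps increases and decreases symmetric around zero, which is one reason a log ratio is a convenient summary for this kind of comparison.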

These insights revealed potential links to human health. In some cases, pairs of variants were associated with gene activity linked to lower levels of LDL, or “bad” cholesterol. Other combinations appear to affect a gene associated with blood pressure. The team also identified two variants near the ESS2 gene—associated with developmental disorders—whose combined effect on gene expression was greater than would be expected from either variant alone.
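
The idea of a combined effect that is "greater than would be expected from either variant alone" can be expressed as a deviation from an additive expectation. The Python sketch below assumes activities are measured on a log2 scale and uses a simple additive model as the baseline; the function names and numbers are hypothetical, and the study's own statistical definition of an interaction may differ.

```python
def expected_additive(baseline, single_a, single_b):
    """Activity expected for the double variant if the two effects simply add.

    All inputs are log2 reporter activities: the sequence with neither
    variant (baseline) and the sequences carrying variant A or B alone.
    """
    return baseline + (single_a - baseline) + (single_b - baseline)


def interaction_term(baseline, single_a, single_b, double):
    """Observed double-variant activity minus the additive expectation.

    A value near zero means the variants act independently; a clearly
    positive or negative value suggests a non-additive combination.
    """
    return double - expected_additive(baseline, single_a, single_b)


# Made-up values: each variant alone raises activity modestly,
# but the pair together raises it more than the sum of both.
print(interaction_term(baseline=0.0, single_a=0.5, single_b=0.25, double=1.5))
```

In practice, measurement noise means a nonzero interaction term is only meaningful when it is large relative to the variability across replicates.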

Functional dissection of complex trait variants at single-nucleotide resolution [The Jackson Laboratory]

In another example, the researchers pinpointed a single variant, discovered in people of European ancestry, that is associated with long-term blood sugar control. Based on its molecular behavior, they predicted that similar but previously understudied variants, found predominantly in people of African ancestry, would show a similar association. Follow-up analysis confirmed that prediction, underscoring the importance of understanding genetic mechanisms across diverse populations.

While the study identified which DNA variants regulate specific protein-coding genes in the brain, liver and blood cells, additional experiments will be needed to determine how those variants ultimately influence traits and disease risk. Given the body’s many tissues and thousands of distinct cell types, switching genes on or off in a single cell type is only one piece of a much larger puzzle in determining health outcomes. “We caution that conclusively demonstrating causality for a cellular or organismal phenotype requires genetic manipulation, such as endogenously modifying variants in their native context, or modification of the CRE harboring the variant, combined with appropriate phenotypic read-outs,” they wrote. In addition, millions of genetic variants remain untested. Even so, the researchers say the findings can already begin strengthening how scientists study genetic variation and how it influences health traits.

In summary, the team wrote in their paper, “… our approach systematically measures the regulatory effects of hundreds of thousands of trait-associated variants, dissects their mechanisms, identifies their epistatic effects, and reveals the complex interplay between common regulatory variants and their sequence contexts.”

Tewhey added, “These findings do more than explain known disease associations. They provide training data to build predictive models of the effects of variants we haven’t yet studied or that remain undiscovered.”

Tewhey, Reilly, and their colleagues recently created such a model with this data. In work published in Nature in 2024, they used the model to design synthetic DNA sequences that could selectively turn genes on in distinct tissue types one at a time. The new study also builds on work by Tewhey and by Ulirsch, published in 2016 while the two were colleagues at the Broad Institute. Together, these advances point toward a future where genetic risk can be more accurately predicted and where therapies can be designed to act only in the tissues where they are needed most.