Scientists at the Baylor College of Medicine and collaborating institutions used complementary approaches that integrate exome sequencing and evolutionary action machine learning to identify protein changes and their associated mechanisms in systemic sclerosis (SSc), a severe autoimmune disease with complex genetic causes.
The researchers published their study “Integrative exome sequencing and machine learning identify MICB and interferon pathway genes as contributors to SSc risk” in the Annals of the Rheumatic Diseases.
A number of genetic contributors to SSc have been identified, but others remain unknown, which has impeded the development of targeted therapies. Previous genome-wide association studies (GWAS), which analyzed the frequency of common genetic variants, have shown that the strongest genetic contributors are located in the human leucocyte antigen (HLA) region on chromosome six.
In this study, researchers led by first author Shamika Ketkar, PhD, performed a GWAS using exome sequencing data from 2,559 SSc patient cases and 893 healthy controls in the Scleroderma Family Registry and DNA Repository at the University of Texas Health Science Center at Houston. They aimed to find novel genes and rare variants contributing to SSc risk.
“What truly surprised and excited us was the discovery and replication of MICB, a gene located within the HLA region but acting independently of the classical HLA genes. MICB had not previously been implicated in systemic sclerosis, and its identification represents a novel genetic contributor and a potential therapeutic target,” said Ketkar, assistant professor of molecular and human genetics at Baylor.
Collaborators in Spain replicated the findings
Collaborators in Spain replicated the findings using previously published European GWAS data comprising nearly 10,000 cases, which further strengthened the results. At Baylor, the laboratory of Olivier Lichtarge, MD, PhD, used its evolutionary action machine learning (EAML) framework to analyze the exome sequencing data and prioritize genes with high-impact variants predictive of SSc.
The results once again pointed to MICB, as well as to other genes on chromosome six such as NOTCH4, and to rare missense variants in genes enriched in interferon signaling (a key pathway in the immune system), including IFI44L and IFIT5.
“With our machine learning framework, we are not only identifying whether a variant occurs frequently, but also, using evolutionary data across all species, we are weighing the likelihood the variant is functionally disruptive to the protein and eventually to the patient,” according to Lichtarge, Cullen Chair and professor of molecular and human genetics, biochemistry and molecular biology and pharmacology. “We previously used this method in diseases with much larger genome data sets, like Alzheimer’s disease and heart disease, and in this study, we show that it can be effective in complex diseases with a smaller patient data set.”
Functional impact
To understand the functional impact of the genetic variants identified in the study, researchers integrated publicly available single-cell RNA sequencing data from SSc skin biopsies to resolve cell type-specific expression patterns of risk genes. They also performed expression quantitative trait locus (eQTL) analysis using whole blood datasets to establish regulatory links between disease-associated variants and transcriptomic changes.
MICB and NOTCH4 were found to be expressed in fibroblasts and endothelial cells, two cell types that play central roles in fibrosis and vasculopathy, key clinical features of SSc. These complementary analyses confirmed functional regulatory effects of identified risk genes.
“To solve complex diseases like SSc, we need to combine different approaches and machine learning to the analysis of large DNA, RNA and protein data sets to discover otherwise hidden targets for treatment,” pointed out corresponding author Brendan Lee, MD, PhD, professor, chair, and Robert and Janice McNair Endowed Chair of molecular and human genetics at Baylor.
Other authors who contributed to this work are affiliated with one of the following institutions: Baylor College of Medicine, McGovern Medical School at UTHealth Houston, Institute of Parasitology and Biomedicine Lopez-Neyra and Regeneron Pharmaceuticals.