Generative AI Produces Small Enzymes for Bioprocessing


More than 20 years ago, scientists at Pfizer touted the benefits of using enzymes in various steps in bioprocessing. In some situations, though, bioprocessors need the smallest functional version of a protein that can be made. As Hiroyuki Hamada, DEng, assistant professor of bioscience and biotechnology at Kyushu University in Fukuoka, Japan, and his colleagues recently reported: “The construction of small proteins by removing amino acid subsequences that are not involved in function, activity, or structure is crucial for bioprocessing and drug development.”

Nonetheless, getting the right small enzymes for a bioprocess remains challenging. As Hamada’s team pointed out: “Traditional design methods often focus on reconstructing functional motifs, but they face challenges in stabilizing structure and reproducing function.”

In search of a better approach to designing and developing these small proteins, Hamada’s team turned to ProtGPT2. In 2022, Noelia Ferruz, PhD, then a postdoctoral researcher at Universität Bayreuth in Bavaria, Germany, and now running her own lab at the Centre for Genomic Regulation in Barcelona, and her colleagues unveiled ProtGPT2, which they described as “a language model trained on the protein space that generates de novo protein sequences following the principles of natural ones.”

To put ProtGPT2 to the test in an in silico approach to making small enzymes for bioprocessing, Hamada and his colleagues started with malate dehydrogenase (MDH). The scientists collected amino acid data on MDH, used ProtGPT2 to generate sequences that were smaller than the natural version of this enzyme, and then analyzed the sequences.
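The screening step described above, keeping only generated sequences that are shorter than the natural enzyme, can be illustrated with a minimal Python sketch. It is not the method used by Hamada’s team, whose filtering criteria are not given here. The function names, the length cutoff, and the rule of discarding exact copies of natural sequences are all assumptions made for illustration, and the sketch expects the generated sequences to be supplied as plain strings rather than calling ProtGPT2 itself.

```python
from typing import Iterable, List

# The 20 standard amino acids in one-letter code.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean(seq: str) -> str:
    """Remove whitespace and line breaks from a raw sequence and upper-case it."""
    return "".join(seq.split()).upper()


def select_small_candidates(
    generated: Iterable[str],
    natural: Iterable[str],
    max_length: int,
) -> List[str]:
    """Hypothetical filter: keep generated sequences that are strictly shorter
    than max_length, use only standard residues, and are not exact copies of
    a natural sequence or of an earlier candidate."""
    natural_set = {clean(s) for s in natural}
    seen = set()
    kept = []
    for raw in generated:
        seq = clean(raw)
        if not seq or set(seq) - VALID_RESIDUES:
            continue  # empty, or contains a non-standard character
        if len(seq) >= max_length:
            continue  # not smaller than the reference enzyme
        if seq in natural_set or seq in seen:
            continue  # identical to a natural sequence or a duplicate
        seen.add(seq)
        kept.append(seq)
    return kept
```

In practice, `max_length` would be set to the length of the natural enzyme being miniaturized, and the surviving candidates would then go on to the alignment and structure-prediction analyses described below.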

“Population analysis, including multiple sequence alignment (MSA) and t-distributed stochastic neighbor embedding (tSNE), revealed that ProtGPT2 for MDH identified functional motifs of MDH and incorporated them into the generated sequences,” noted Hamada’s team, and “the generated sequences were highly similar to natural MDH sequences.”

Additionally, 9 of 10 randomly selected sequences from the ProtGPT2-generated enzymes were novel variants. AlphaFold2 showed that the 3D structures of these variants resembled structures in natural MDH, and InterPro revealed active sites in two of the novel sequences.

Based on these results, Hamada’s team concluded: “ProtGPT2 for MDH has the potential to design amino acid sequence candidates for small MDHs.” If similar results can be produced with other enzymes, bioprocessors might move toward an in silico-based method of designing smaller versions of them.