A general model for predicting enzyme functions based on enzymatic reactions
Journal of Cheminformatics ( IF 8.6 ) Pub Date : 2024-03-31 , DOI: 10.1186/s13321-024-00827-y
Wenjia Qian , Xiaorui Wang , Yu Kang , Peichen Pan , Tingjun Hou , Chang-Yu Hsieh

Accurate prediction of Enzyme Commission (EC) numbers for chemical reactions is essential for understanding and manipulating enzyme functions, biocatalytic processes, and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, offering clear advantages over costly and time-consuming experimental verification. However, the prediction accuracy of most available models, which are trained on records of chemical reactions that do not specify the enzymatic catalysts, is rather limited. In this study, we introduce BEC-Pred, a BERT-based multiclass classification model for predicting the EC numbers associated with reactions. Leveraging transfer learning, our approach achieves accurate prediction across a wide variety of EC numbers solely through analysis of the SMILES sequences of substrates and products. BEC-Pred outperformed other sequence-based and graph-based ML methods, attaining an accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores, with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool for accelerating research in synthetic biology and drug metabolism. Moreover, we discuss a few examples of how BEC-Pred accurately predicts the enzymatic classification for Novozym 435-induced hydrolysis and efficient lipase-catalyzed synthesis. We anticipate that BEC-Pred will have a positive impact on the progress of enzymatic research.
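To make the input and output of such a classifier concrete, the following is a minimal sketch of how a reaction SMILES string separates into substrates and products, and how an EC number decomposes into its hierarchical levels. It is not the BEC-Pred implementation: the function names `split_reaction_smiles` and `ec_levels` are hypothetical, the ester hydrolysis reaction and its EC label are illustrative choices not taken from the paper, and the parsing is plain string handling rather than a cheminformatics toolkit.

```python
from typing import List, Tuple


def split_reaction_smiles(rxn: str) -> Tuple[List[str], List[str]]:
    """Split a reaction SMILES 'substrates>agents>products' into
    (substrates, products). The agents field is ignored (assumption:
    only substrates and products are used as model input)."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError(f"not a reaction SMILES: {rxn!r}")
    reactants, _agents, products = parts
    # '.' separates individual molecules within each side of the reaction
    return (
        [s for s in reactants.split(".") if s],
        [p for p in products.split(".") if p],
    )


def ec_levels(ec: str) -> List[str]:
    """Return the cumulative prefixes of an EC number, from the
    top-level class down to the full number."""
    fields = ec.split(".")
    return [".".join(fields[:i]) for i in range(1, len(fields) + 1)]


# Illustrative example (not from the paper): hydrolysis of ethyl acetate
# into acetic acid and ethanol, labelled here as a carboxylesterase reaction.
rxn = "CCOC(C)=O.O>>CC(=O)O.CCO"
label = "3.1.1.1"

substrates, products = split_reaction_smiles(rxn)
print(substrates, products)
print(ec_levels(label))
```

A model of the kind described above would take the reaction string as its input sequence and emit one EC label as its class; the level prefixes show why predictions can be scored at coarser or finer granularity.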

Updated: 2024-03-31