Predicting Antimicrobial Peptides Using ESMFold-Predicted Structures and ESM-2-Based Amino Acid Features with Graph Deep Learning
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2024-05-13, DOI: 10.1021/acs.jcim.3c02061
Greneter Cordoves-Delgado, César R. García-Jacas

Antimicrobial resistance is a serious threat to human health, and drugs based on antimicrobial peptides (AMPs) are one of the alternatives for addressing it. Shallow and deep learning (DL)-based models for predicting AMPs have mainly been built from amino acid sequences. Recent advances in tertiary (3D) structure prediction have opened new opportunities in this field, and models based on graphs derived from predicted peptide structures have recently been proposed. However, these models do not use state-of-the-art approaches to encode evolutionary information, and they are memory- and time-consuming because they depend on multiple sequence alignment. Herein, we present a framework to create alignment-free models based on graph representations generated from ESMFold-predicted peptide structures, whose nodes are characterized with amino acid-level evolutionary information derived from the Evolutionary Scale Modeling (ESM-2) models. A graph attention network (GAT) was implemented to assess the usefulness of the framework in AMP classification. To this end, a set comprising 67,058 peptides was used. The proposed methodology made it possible to build GAT models with generalization abilities consistently better than those of 20 state-of-the-art non-DL-based and DL-based models. The best GAT models were developed using evolutionary information derived from the 36- and 33-layer ESM-2 models. Similarity studies showed that the best GAT models encoded different chemical spaces, and thus they were fused to significantly improve the classification. In general, the results suggest that esm-AxP-GDL is a promising tool to develop good, structure-dependent, and alignment-free models that can be successfully applied in the screening of large data sets. This framework should be useful not only for classifying AMPs but also for modeling other peptide and protein activities.
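
The pipeline described in the abstract (a residue graph built from a predicted structure, node features taken from a protein language model, and an attention layer over the graph) can be illustrated with a minimal, dependency-free Python sketch. This is not the esm-AxP-GDL implementation. The 10 Å contact threshold, the toy coordinates, the random 8-dimensional vectors standing in for ESM-2 embeddings, the random weights, the mean-pooling readout, and all function names are assumptions made only for illustration.

```python
import math
import random


def contact_edges(coords, threshold=10.0):
    """Undirected edges between residues closer than `threshold` angstroms.

    The threshold value is an assumption for this sketch.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= threshold:
                edges.append((i, j))
    return edges


def neighbours(n_nodes, edges):
    """Adjacency lists that include a self-loop for every node."""
    adj = {i: [i] for i in range(n_nodes)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def gat_layer(features, adj, W, a):
    """Single-head graph attention: h_i' = sum_j alpha_ij * (W h_j)."""
    h = [matvec(W, x) for x in features]
    out = []
    for i in range(len(features)):
        # Unnormalised score for each neighbour j: LeakyReLU(a . [Wh_i || Wh_j])
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, h[i] + h[j])))
            for j in adj[i]
        ]
        # Numerically stable softmax over the neighbourhood of node i
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        out.append([
            sum(alpha * h[j][k] for alpha, j in zip(alphas, adj[i]))
            for k in range(len(W))
        ])
    return out


# Toy peptide: four residues on a line, 3.8 angstroms apart (hypothetical
# coordinates standing in for positions read from a predicted structure).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
edges = contact_edges(coords)
adj = neighbours(len(coords), edges)

# Random vectors standing in for per-residue ESM-2 embeddings (assumption;
# real embeddings are much wider and come from the language model).
rng = random.Random(0)
in_dim, out_dim = 8, 3
features = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in coords]
W = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [rng.uniform(-1, 1) for _ in range(2 * out_dim)]

hidden = gat_layer(features, adj, W, a)

# Mean pooling gives one vector per peptide, which a classifier head could
# then map to an AMP / non-AMP score (readout choice is an assumption).
pooled = [sum(column) / len(hidden) for column in zip(*hidden)]
print(edges)
print(pooled)
```

In practice the weights would be learned, and a graph learning library would replace the hand-written layer. The sketch only shows how structure-derived edges and sequence-derived node features meet in one attention step.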

Updated: 2024-05-13