Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing
Nature Communications (IF 16.6), Pub Date: 2024-05-14, DOI: 10.1038/s41467-024-48437-4
You Wu, Wenna Shao, Mengxiao Yan, Yuqin Wang, Pengfei Xu, Guoqiang Huang, Xiaofei Li, Brian D. Gregory, Jun Yang, Hongxia Wang, Xiang Yu

Nanopore direct RNA sequencing (DRS) has emerged as a powerful tool for RNA modification identification. However, concurrently detecting multiple types of modifications in a single DRS sample remains a challenge. Here, we develop TandemMod, a transferable deep learning framework capable of detecting multiple types of RNA modifications in single DRS data. To train high-performance TandemMod models, we generate in vitro epitranscriptome datasets from cDNA libraries, containing thousands of transcripts labeled with various types of RNA modifications. We validate the performance of TandemMod on both in vitro transcripts and in vivo human cell lines, confirming its high accuracy for profiling m6A and m5C modification sites. Furthermore, we perform transfer learning for identifying other modifications such as m7G, Ψ, and inosine, significantly reducing training data size and running time without compromising performance. Finally, we apply TandemMod to identify 3 types of RNA modifications in rice grown in different environments, demonstrating its applicability across species and conditions. In summary, we provide a resource with ground-truth labels that can serve as benchmark datasets for nanopore-based modification identification methods, and TandemMod for identifying diverse RNA modifications using a single DRS sample.
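
The transfer-learning step mentioned in the abstract can be pictured with a toy sketch. This is not TandemMod's code, and the abstract does not say which layers TandemMod retrains; the example only illustrates one common form of transfer learning, in which a pretrained feature extractor is frozen and a small new classification head is trained on a limited dataset. All names (`PRETRAINED_W`, `extract_features`, `fine_tune`, `make_toy_data`) and all numbers are hypothetical, and the two-dimensional inputs merely stand in for real signal features.

```python
import math
import random

# Hypothetical "pretrained" weights. In practice these would come from a model
# trained on a data-rich modification type; here they are made-up numbers.
PRETRAINED_W = [[0.8, -0.5], [0.3, 0.9]]


def extract_features(x):
    """Frozen layer: a fixed linear map plus tanh, never updated below."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]


def predict(head, x):
    """New classification head: logistic regression on the frozen features."""
    feats = extract_features(x)
    z = head["b"] + sum(w * f for w, f in zip(head["w"], feats))
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(data, epochs=200, lr=0.5):
    """Train only the head with plain SGD; the pretrained weights stay untouched."""
    head = {"w": [0.0, 0.0], "b": 0.0}
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x)
            err = predict(head, x) - y  # gradient of the log loss w.r.t. the logit
            head["w"] = [w - lr * err * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * err
    return head


def accuracy(head, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    return sum((predict(head, x) >= 0.5) == bool(y) for x, y in data) / len(data)


def make_toy_data(n, seed=0):
    """Two noisy clusters standing in for modified (1) and unmodified (0) sites."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 1.0 if y else -1.0
        data.append(([centre + rng.gauss(0, 0.3), centre + rng.gauss(0, 0.3)], y))
    return data


if __name__ == "__main__":
    small_train_set = make_toy_data(40, seed=1)
    head = fine_tune(small_train_set)
    print("held-out accuracy:", accuracy(head, make_toy_data(40, seed=2)))
```

Because only the three head parameters are updated, a small labeled set is enough for this toy task, which is the general intuition behind the reduced training data size and running time the abstract reports.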




Updated: 2024-05-14