Reliable detection of stochastic epigenetic mutations and associations with cardiovascular aging
GeroScience (IF 5.6), Pub Date: 2024-05-13, DOI: 10.1007/s11357-024-01191-3
Yaroslav Markov, Morgan Levine, Albert T. Higgins-Chen

Stochastic epigenetic mutations (SEMs) have been proposed as novel aging biomarkers to capture heterogeneity in age-related DNA methylation changes. SEMs are defined as outlier methylation patterns at cytosine-guanine dinucleotide sites, categorized as hypermethylated (hyperSEM) or hypomethylated (hypoSEM) relative to a reference. Because SEMs are defined by their outlier status, it is critical to differentiate extreme values due to technical noise or data artifacts from those due to real biology. Using technical replicate data, we found SEM detection is not reliable: across 3 datasets, 24 to 39% of hypoSEM and 46 to 67% of hyperSEM are not shared between replicates. We identified factors influencing SEM reliability—including blood cell type composition, probe beta-value statistics, genomic location, and presence of SNPs. We used these factors in a training dataset to build a machine learning-based filter that removes unreliable SEMs, and found this filter enhances reliability in two independent validation datasets. We assessed associations between SEM loads and aging phenotypes in the Framingham Heart Study and discovered that associations with aging outcomes were in large part driven by hypoSEMs at baseline methylated probes and hyperSEMs at baseline unmethylated probes, which are the same subsets that demonstrate highest technical reliability. These aging associations were preserved after filtering out unreliable SEMs and were enhanced after adjusting for blood cell composition. Finally, we utilized these insights to formulate best practices for SEM detection and introduce a novel R package, SEMdetectR, which uses parallel programming for efficient SEM detection with comprehensive options for detection, filtering, and analysis.
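
To make the outlier definition concrete, here is a minimal sketch of SEM calling and a replicate comparison. The abstract does not state the outlier rule, so the rule used here is an assumption: a convention used in earlier SEM work, which flags a beta value lying more than three interquartile ranges above the third quartile (hyperSEM) or below the first quartile (hypoSEM) of the reference samples. The function names, the probe IDs, the toy beta values, and the "unshared fraction" definition are all hypothetical illustrations. This is not the SEMdetectR interface and not necessarily the metric the authors used.

```python
from statistics import quantiles


def iqr_fences(values, k=3.0):
    """Return (lower, upper) outlier fences for one probe across reference samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def detect_sems(reference, sample, k=3.0):
    """Return {probe_id: 'hyperSEM' | 'hypoSEM'} for outlier probes only.

    reference: dict probe_id -> list of beta values in the reference samples
    sample:    dict probe_id -> beta value of the sample being tested
    k:         IQR multiplier (assumed to be 3, not taken from the abstract)
    """
    calls = {}
    for probe, beta in sample.items():
        lower, upper = iqr_fences(reference[probe], k)
        if beta > upper:
            calls[probe] = "hyperSEM"
        elif beta < lower:
            calls[probe] = "hypoSEM"
    return calls


def unshared_fraction(calls_a, calls_b, kind):
    """Fraction of SEMs of `kind` called in either replicate but not in both.

    One possible way to quantify disagreement between technical replicates.
    """
    a = {p for p, c in calls_a.items() if c == kind}
    b = {p for p, c in calls_b.items() if c == kind}
    union = a | b
    if not union:
        return 0.0
    return len(union - (a & b)) / len(union)


# Toy reference panel: three probes measured in five samples (made-up values).
reference = {
    "cg_a": [0.80, 0.82, 0.81, 0.79, 0.83],  # methylated at baseline
    "cg_b": [0.10, 0.12, 0.11, 0.09, 0.13],  # unmethylated at baseline
    "cg_c": [0.50, 0.52, 0.48, 0.51, 0.49],  # intermediate
}

# Two technical replicates of the same hypothetical sample.
replicate_1 = {"cg_a": 0.40, "cg_b": 0.60, "cg_c": 0.50}
replicate_2 = {"cg_a": 0.42, "cg_b": 0.15, "cg_c": 0.51}

calls_1 = detect_sems(reference, replicate_1)
calls_2 = detect_sems(reference, replicate_2)

print(calls_1)
print(calls_2)
print(unshared_fraction(calls_1, calls_2, "hypoSEM"))
print(unshared_fraction(calls_1, calls_2, "hyperSEM"))
```

In this toy case the hypoSEM at the baseline-methylated probe is called in both replicates, while the hyperSEM at `cg_b` appears in only one, which is the kind of replicate disagreement the abstract reports. A real analysis would run over hundreds of thousands of probes and would add the filtering and cell-composition adjustment the authors describe.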




Updated: 2024-05-13