Simultaneously improving accuracy and computational cost under parametric constraints in materials property prediction tasks
Journal of Cheminformatics (IF 8.6). Pub Date: 2024-02-16. DOI: 10.1186/s13321-024-00811-6
Vishu Gupta , Youjia Li , Alec Peltekian , Muhammed Nur Talha Kilic , Wei-keng Liao , Alok Choudhary , Ankit Agrawal

Modern data mining techniques using machine learning (ML) and deep learning (DL) algorithms excel at the regression task of materials property prediction across a variety of materials representations. To improve the predictive performance of deep neural network models, researchers have added more layers and developed new architectural components, creating sophisticated deep models that aid the training process and improve the predictive ability of the final model. These modifications, however, usually demand substantial computational resources, further lengthening already long model training times and putting such models out of reach for many researchers. In this paper, we study and propose a deep neural network framework for regression problems, built from fully connected layers, that accepts any numerical vector-based materials representation as model input. We present iBRNet, a novel deep regression neural network with branched skip connections and multiple schedulers, which reduces the number of parameters used to construct the model, improves accuracy, and shortens the training time of the predictive model. We train the model on composition-based numerical vectors representing the elemental fractions of each material and compare its performance against traditional ML models and several established DL architectures. Using multiple datasets of varying sizes for training and testing, we show that the proposed iBRNet models outperform state-of-the-art ML and DL models at every data size. We also show that the branched structure and the use of multiple schedulers yield fewer parameters, faster model training, and better convergence than other neural networks.
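The abstract does not detail iBRNet's architecture beyond fully connected layers with branched skip connections. As a rough illustration only, the forward pass of one hypothetical branched block, where an identity skip branch is added back to the output of a small dense main branch, might look like the following NumPy sketch; the layer widths and the 86-element elemental-fraction input are assumptions, not values from the paper:

```python
import numpy as np

def relu(x):
    # Elementwise rectified linear activation
    return np.maximum(x, 0.0)

def branched_block(x, w1, w2):
    """One branched skip-connection block (sketch).

    Main branch: two fully connected layers; skip branch: identity.
    The two branches are summed before the final activation.
    """
    h = relu(x @ w1)        # first dense layer of the main branch
    h = h @ w2              # second dense layer (no activation yet)
    return relu(h + x)      # merge skip branch, then activate

rng = np.random.default_rng(0)
dim = 86                    # hypothetical elemental-fraction vector length
x = rng.random((4, dim))    # batch of 4 composition vectors
w1 = rng.standard_normal((dim, dim)) * 0.01
w2 = rng.standard_normal((dim, dim)) * 0.01
out = branched_block(x, w1, w2)
print(out.shape)            # (4, 86): the skip connection preserves width
```

Because the skip branch is an identity, the block's input and output widths must match, which is what lets such blocks be stacked deeply without vanishing gradients.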
Scientific contribution: The combination of multiple callback functions in deep neural networks minimizes training time and maximizes accuracy for the task of materials property prediction in a controlled computational environment with parametric constraints.
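The specific callback functions are not named in the abstract. As a minimal plain-Python sketch, two common training callbacks, learning-rate reduction on plateau and early stopping (both hypothetical stand-ins for whatever iBRNet actually uses), can be combined in one loop like this:

```python
class ReduceOnPlateau:
    """Halve the learning rate when validation loss stops improving (sketch)."""
    def __init__(self, patience=2, factor=0.5):
        self.best, self.wait = float("inf"), 0
        self.patience, self.factor = patience, factor

    def step(self, lr, val_loss):
        if val_loss < self.best - 1e-8:
            self.best, self.wait = val_loss, 0
            return lr
        self.wait += 1
        if self.wait >= self.patience:
            self.wait = 0
            return lr * self.factor
        return lr

class EarlyStopping:
    """Stop training after `patience` epochs without improvement (sketch)."""
    def __init__(self, patience=5):
        self.best, self.wait, self.patience = float("inf"), 0, patience

    def should_stop(self, val_loss):
        if val_loss < self.best - 1e-8:
            self.best, self.wait = val_loss, 0
            return False
        self.wait += 1
        return self.wait >= self.patience

# Simulated validation losses: improvement, then a flat plateau.
lr = 0.01
plateau = ReduceOnPlateau(patience=2, factor=0.5)
stopper = EarlyStopping(patience=5)
for epoch, loss in enumerate([1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]):
    lr = plateau.step(lr, loss)       # scheduler reacts to the plateau first
    if stopper.should_stop(loss):     # stopper ends training if it persists
        print(f"early stop at epoch {epoch}, lr={lr}")
        break
```

The shorter patience on the scheduler lets the learning rate drop (and possibly rescue training) before the stopper gives up, which is the usual way such callbacks are layered.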

Updated: 2024-02-16