Steel product number recognition framework using semantic mask-conditioned diffusion model with limited data, Journal of Industrial Information Integration



Journal of Industrial Information Integration (IF 15.7), Pub Date: 2024-01-17, DOI: 10.1016/j.jii.2024.100559
Hyeyeon Choi, Jong Pil Yun, Bum Jun Kim, Hyeonah Jang, WooSang Shin, Sang Woo Kim

Steel product number recognition (SPNR) is crucial for efficient product management in the steel industry, and recent work has focused on number recognition algorithms as a way to automate that management. Deep learning-based methods have improved SPNR performance, but previous studies have overlooked the issue of data scarcity. In actual industrial environments, product number data are scarce because of frequently changing environmental factors such as illumination, dust, and heated surfaces, and because of the seasonality of the numbers themselves. This lack of data significantly affects the accuracy of SPNR. To address data scarcity, this paper proposes an SPNR framework that uses images generated by the proposed semantic mask-conditioned diffusion model (SMDM). First, we designed an SMDM architecture, based on a diffusion model, that includes encoding parts for the font style and the text format. Second, the SMDM was trained to generate product number images with desired font styles, text formats, and contents. Finally, the generated images were used as training data for the SPNR model. Extensive experiments on three distinct types of real-world datasets demonstrate that the proposed framework yields significantly higher SPNR accuracy than existing methods. The experimental results also show that the SMDM can generate out-of-distribution samples that lie outside the distribution of the small training dataset, thereby improving the diversity of the training data. By addressing the data scarcity problem, our framework can help advance the application of deep learning-based algorithms in the steel industry.
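
The abstract does not specify how the semantic mask is constructed, so the following is only an illustrative sketch of one plausible form of conditioning input: a per-pixel class map that lays out the desired text content in a given format. The function name `build_semantic_mask`, the label set `ALPHABET`, and the box dimensions are all assumptions made for illustration and are not taken from the paper.

```python
import string

# Hypothetical label set (an assumption, not from the paper):
# 0 is background, characters map to class ids 1..36.
ALPHABET = string.digits + string.ascii_uppercase
CLASS_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def build_semantic_mask(text, char_w=4, char_h=6, gap=1):
    """Return a 2D list of class ids laying out `text` left to right.

    Each character occupies a char_w x char_h box filled with its class id.
    Boxes are separated from each other and from the border by `gap`
    background pixels.
    """
    if not text:
        raise ValueError("text must be non-empty")
    unknown = [ch for ch in text if ch not in CLASS_ID]
    if unknown:
        raise ValueError(f"unsupported characters: {unknown}")

    height = char_h + 2 * gap
    width = gap + len(text) * (char_w + gap)
    mask = [[0] * width for _ in range(height)]

    for i, ch in enumerate(text):
        x0 = gap + i * (char_w + gap)  # left edge of the i-th character box
        for y in range(gap, gap + char_h):
            for x in range(x0, x0 + char_w):
                mask[y][x] = CLASS_ID[ch]
    return mask


# Example: a mask for a made-up three-character product number.
mask = build_semantic_mask("A17")
```

In a setup like this, a mask of this kind would be fed to the generator alongside a font-style reference, so that content and layout can be varied independently of appearance; whether the SMDM works this way in detail would need to be checked against the full paper.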


Updated: 2024-01-17
