Data-driven automatic classification model for construction accident cases using natural language processing with hyperparameter tuning
Automation in Construction (IF 10.3), Pub Date: 2024-05-09, DOI: 10.1016/j.autcon.2024.105458
Louis Kumi, Jaewook Jeong, Jaemin Jeong

The construction industry, while vital to societal progress, is marred by a high incidence of accidents and injuries. Manual classification of accident cases is labor-intensive and susceptible to human bias. This study addresses this challenge by developing an automated accident case classification system for the construction industry using Natural Language Processing and machine learning techniques. The study followed four steps: (1) establishment of the dataset, (2) Korean Natural Language Processing, (3) selection of machine learning models, and (4) model evaluation. The models exhibited competitive performance, demonstrating high accuracy, precision, and recall rates across all classification tasks. XGBoost outperformed Naive Bayes (NB), Support Vector Machine (SVM), and k-Nearest Neighbors (KNN) for accident type, facility type, and work type, with accuracies of 0.80, 0.56, and 0.67, respectively. The results also provided insights into the factors influencing accident classification. This study contributes to construction safety by providing a data-driven foundation for safety decision-making, resource allocation, and benchmarking.
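
The four steps in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It assumes whitespace tokenization as a stand-in for Korean morphological analysis, uses a small multinomial Naive Bayes classifier (one of the baselines named above) rather than XGBoost, and tunes a single hyperparameter, the smoothing value `alpha`, by grid search on a hold-out set. The abstract does not say which tuning procedure was used, so the grid search is an assumption, and the accident reports and labels below are invented toy data.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: whitespace split; real Korean text would need a morphological analyzer.
    return text.lower().split()

def train_nb(docs, labels, alpha):
    """Fit a multinomial Naive Bayes model with additive smoothing alpha (> 0)."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"alpha": alpha, "class_counts": Counter(labels),
            "word_counts": word_counts, "vocab": vocab, "n_docs": len(docs)}

def predict_nb(model, doc):
    """Return the class with the highest log-posterior for one document."""
    best_label, best_score = None, -math.inf
    vocab_size = len(model["vocab"])
    for label, n in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n / model["n_docs"])  # log prior
        for tok in tokenize(doc):
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                score += math.log((counts[tok] + model["alpha"])
                                  / (total + model["alpha"] * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision and recall."""
    labels = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return {"accuracy": sum(t == p for t, p in pairs) / len(pairs),
            "precision": sum(precisions) / len(labels),
            "recall": sum(recalls) / len(labels)}

def tune_alpha(train_docs, train_labels, valid_docs, valid_labels, grid):
    """Grid search: keep the alpha with the best hold-out accuracy."""
    best = None
    for alpha in grid:
        model = train_nb(train_docs, train_labels, alpha)
        preds = [predict_nb(model, d) for d in valid_docs]
        acc = evaluate(valid_labels, preds)["accuracy"]
        if best is None or acc > best[1]:
            best = (alpha, acc, model)
    return best

# Toy data, invented for illustration only.
train_docs = ["worker fell from scaffold", "worker fell from ladder",
              "fell through roof opening", "struck by falling pipe",
              "struck by reversing truck", "hit by swinging load"]
train_labels = ["fall", "fall", "fall", "struck", "struck", "struck"]
valid_docs = ["worker fell from roof", "struck by truck"]
valid_labels = ["fall", "struck"]

best_alpha, best_acc, best_model = tune_alpha(
    train_docs, train_labels, valid_docs, valid_labels, grid=[0.1, 1.0, 10.0])
print(best_alpha, best_acc)
```

In practice the hold-out split would usually be replaced by cross-validation, and the same tuning loop would be repeated for each classification target (accident type, facility type, work type).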

Updated: 2024-05-09