Fusion-driven deep feature network for enhanced object detection and tracking in video surveillance systems
Information Fusion ( IF 18.6 ) Pub Date : 2024-04-18 , DOI: 10.1016/j.inffus.2024.102429
Deepak Kumar Jain , Xudong Zhao , Chenquan Gan , Piyush Kumar Shukla , Amar Jain , Sourabh Sharma

Object detection and tracking (ODT) is a crucial research area in video surveillance (VS) systems and remains a significant challenge in computer vision and image processing. The goal is to extract moving objects of various classes from video frames and track them over time. Leveraging advances in computer intelligence, this study proposes a novel optimized deep fused learning (ODFL) model for efficient object detection and tracking in video surveillance systems, focused on detecting and tracking multiple objects in video frames. First, the input video is pre-processed: it is converted into frames, and Gaussian filtering is applied to remove noise and enhance frame quality. Next, a Dense Convolution Feature Fusion Network (D-ConvFFN) extracts and fuses the relevant features. The Enhanced RefineDet-based Fire Hawk (ERFH) object detection module then recognizes multiple objects in the frames, with its hyperparameters tuned by the Fire Hawk optimizer. Recognized objects are assigned to classes by a softmax classifier, and a Hungarian-based SORT model performs multi-object tracking. The proposed ODFL model is implemented in Python and evaluated on the PETS S2 2009 and UA-DETRAC datasets across multiple metrics. Compared with prevailing architectures, it achieves a maximum accuracy of 99.16% on PETS S2 2009 and 99.42% on UA-DETRAC, outperforming existing classifiers for multi-class object detection and tracking.
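Two of the pipeline stages described above have standard implementations that can be sketched concretely: Gaussian filtering of frames, and Hungarian (optimal-assignment) matching of detections to tracks, as used in SORT-style trackers. The sketch below is illustrative only; the kernel sigma and IoU threshold are assumed values, not parameters reported by the paper, and the paper's own ERFH detector and fused features are not reproduced here.

```python
# Minimal sketch of frame smoothing + Hungarian data association.
# sigma and iou_thresh are illustrative assumptions, not values from the paper.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.optimize import linear_sum_assignment

def preprocess(frame, sigma=1.0):
    """Gaussian-filter a grayscale frame to suppress noise."""
    return gaussian_filter(frame.astype(np.float64), sigma=sigma)

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_thresh=0.3):
    """Match existing track boxes to new detections.

    Builds a cost matrix of (1 - IoU), solves the optimal assignment
    with the Hungarian algorithm, and keeps only pairs whose overlap
    clears the threshold. Returns (track_index, detection_index) pairs.
    """
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_thresh]
```

In a full SORT implementation, track boxes would come from a per-object Kalman-filter prediction rather than the previous frame's raw boxes, and unmatched detections would spawn new tracks.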

Updated: 2024-04-18