An optimization method of human skeleton keyframes selection for action recognition
Complex & Intelligent Systems ( IF 5.8 ) Pub Date : 2024-03-30 , DOI: 10.1007/s40747-024-01403-5
Hao Chen , Yuekai Pan , Chenwu Wang

In skeleton-based action recognition, which relies on the features of human skeleton joint points, the selection of keyframes from the skeleton sequence is an important problem that directly affects recognition accuracy. To improve the effectiveness of keyframe selection, this paper proposes inflection point frames and, based on them, transforms keyframe selection into a multi-objective optimization problem. First, pose features are extracted from the input skeleton joint point data and used to construct the pose feature vector of each frame in temporal order; then, the inflection point frames in the sequence are determined according to the flow of momentum of each body part. Next, the pose feature vectors are input into the keyframe multi-objective optimization model, which fuses domain information and the number of keyframes; finally, the output keyframes are fed to the action classifier. To verify the effectiveness of the method, three public datasets (MSR-Action3D, UTKinect-Action, and Florence3D-Action) are chosen for simulation experiments. The results show that the keyframe sequences obtained by this method significantly improve the accuracy of multiple action classifiers, with average recognition accuracies on the three datasets reaching 94.6%, 97.6%, and 94.2%, respectively. In addition, combining the optimized keyframes with a deep learning classifier on the NTU RGB + D dataset yields accuracies of 83.2% and 93.7%.
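
The abstract does not say how the flow of momentum is computed or how an inflection point frame is detected, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes unit joint masses and a unit time step, so a body part's motion magnitude is the summed displacement of its joints between consecutive frames, and it marks a frame as an inflection point when that magnitude switches between rising and falling for any body part. The names `part_motion` and `inflection_frames`, and the body-part grouping passed in, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position of one joint
Frame = Sequence[Joint]              # all joints of the skeleton in one frame


def part_motion(frames: Sequence[Frame], joint_ids: Sequence[int]) -> List[float]:
    """Motion magnitude of one body part for each frame-to-frame transition.

    Assumption: unit mass and unit time step, so the summed joint
    displacement stands in for the part's momentum magnitude.
    """
    return [
        sum(math.dist(prev[j], cur[j]) for j in joint_ids)
        for prev, cur in zip(frames, frames[1:])
    ]


def inflection_frames(
    frames: Sequence[Frame], parts: Dict[str, Sequence[int]]
) -> List[int]:
    """Sorted frame indices where any part's motion has a strict local extremum.

    Assumption: an "inflection point frame" is a frame at which a part's
    motion magnitude changes from increasing to decreasing or vice versa.
    """
    picked = set()
    for joint_ids in parts.values():
        m = part_motion(frames, joint_ids)
        for i in range(1, len(m) - 1):
            # Opposite slopes on either side of transition i mean an extremum.
            if (m[i] - m[i - 1]) * (m[i + 1] - m[i]) < 0:
                picked.add(i + 1)  # frame reached by transition i
    return sorted(picked)
```

In a pipeline like the one described, frames found this way could serve as candidate keyframes or as a constraint for the multi-objective optimization step; how the paper actually combines them is not specified in the abstract.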




Updated: 2024-03-30