28 April 2023 Gesture recognition fusion two-stream 3D CNN and FPN
Yingbo Wang, Hua Li
Proceedings Volume 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022); 1261034 (2023) https://doi.org/10.1117/12.2671044
Event: Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 2022, Wuhan, China
Abstract
In VR/AR human–machine interaction, natural and simple dynamic gesture recognition has attracted much attention. To improve the accuracy of dynamic gesture recognition in human–machine interaction, this paper proposes a new dynamic gesture recognition method, FPN-3DResNeXt, which combines a two-stream three-dimensional convolutional neural network (3DResNeXt) with a feature pyramid network (FPN). The method modifies the structure of the 3DResNeXt network by adding a feature pyramid and an attention channel and optimizes the model parameters, thereby improving recognition accuracy. To improve the convergence speed and stability of the model, batch normalization (BN) is added; this further optimization of the network also reduces the training time. In comparisons with various 3D convolution methods on the EgoGesture dataset, the proposed method achieves a dynamic gesture recognition rate of 95.30%, which is 2.1% higher than that of the gesture recognition method based on 3DResNeXt, and it shows better stability.
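The abstract credits batch normalization (BN) with faster and more stable convergence but does not say where BN sits in the network. As a minimal sketch of the standard training-mode BN computation only, not the authors' implementation, the following normalizes the activations of one feature channel to zero mean and unit variance and then applies a scale and shift. In a 3D CNN the statistics for a channel are pooled over the batch and all spatiotemporal positions; here those values are assumed to be flattened into a single list. The function name `batch_norm` and the sample values are hypothetical.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-mode batch normalization for a single feature channel.

    `batch` holds that channel's activations collected across the
    mini-batch (hypothetical values; a real 3D CNN would also pool over
    time, height, and width).
    """
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance, as used for normalization during training.
    var = sum((x - mean) ** 2 for x in batch) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    # gamma and beta are the learnable scale and shift; fixed here for illustration.
    return [gamma * (x - mean) * inv_std + beta for x in batch]


activations = [2.0, 4.0, 6.0, 8.0]  # assumed sample activations
normalized = batch_norm(activations)
```

With the default `gamma` and `beta`, the output has mean approximately 0 and variance approximately 1, which keeps the input scale of later layers steady during training.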
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yingbo Wang and Hua Li "Gesture recognition fusion two-stream 3D CNN and FPN", Proc. SPIE 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 1261034 (28 April 2023); https://doi.org/10.1117/12.2671044
KEYWORDS
Video

RGB color model

Gesture recognition

3D modeling

Convolution

Neural networks
