DOI:10.3969/j.issn.1003-5060.2024.05.002
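A dual-path parallel model for large-scale gesture recognition
CAO Yidan, WANG Qingshan, WANG Qi
(School of Mathematics, Hefei University of Technology, Hefei 230601, Anhui, China)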
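Abstract
This article takes large-scale gesture sets as its subject and proposes a dual-path parallel gesture recognition model built from an electromyography (EMG) branch and an inertial measurement unit (IMU) branch. First, the dual-path parallel model is designed to extract the data features fully: the EMG branch uses two-dimensional convolutional neural networks in a dual-stream structure whose streams attend to the spatial and channel variations of the EMG signal, respectively, and the IMU branch adds a temporal mechanism to a convolutional long short-term memory (ConvLSTM) network to fuse spatial and temporal information. Second, the model is pre-trained and its parameters are fine-tuned from the pre-trained model to improve generalization. Finally, the model is tested on 500 commonly used Chinese Sign Language gestures; the results show an average recognition rate of 82.1%, which is 21.0% and 6.8% higher than SignSpeaker and CG-Recognizer, respectively.
Keywords
pre-training; gesture recognition; deep learning; electromyography (EMG); inertial measurement unit (IMU)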
CLC number: TP183
Document code: A
Article ID: 1003-5060(2024)05-0585-06
A dual-path parallel model for large-scale gesture recognition
CAO Yidan, WANG Qingshan, WANG Qi
(School of Mathematics, Hefei University of Technology, Hefei 230601, China)
Abstract
This paper proposes a dual-path parallel gesture recognition model for large-scale gesture sets, built from an electromyography (EMG) branch and an inertial measurement unit (IMU) branch. First, the dual-path parallel model is designed to extract the data features fully: the EMG branch uses two-dimensional convolutional neural networks in a dual-stream structure whose streams attend to the spatial and channel variations of the EMG signal, respectively, and the IMU branch adds a temporal mechanism to a convolutional long short-term memory (ConvLSTM) network to fuse spatial and temporal information. Second, the model is pre-trained and its parameters are fine-tuned from the pre-trained model to improve generalization. Finally, tested on 500 commonly used Chinese Sign Language gestures, the model achieves an average recognition rate of 82.1%, which is 21.0% and 6.8% higher than SignSpeaker and CG-Recognizer, respectively.
Keywords
pre-training; gesture recognition; deep learning; electromyography (EMG); inertial measurement unit (IMU)
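The comparison reported in the abstract (82.1%, "21.0% and 6.8% higher") can be read in two ways, and the abstract alone does not say which is meant. The short sketch below works out the baseline recognition rate each reading would imply. The function name `implied_baselines` and both readings are illustrative assumptions, not values taken from the paper; the paper's own result tables remain the authoritative source for the baseline figures.

```python
def implied_baselines(model_acc, gain):
    """Return the baseline accuracy (%) implied by `gain` under two readings.

    Both readings are assumptions made for illustration only.
    """
    # Reading 1: `gain` is a difference in percentage points.
    absolute = model_acc - gain
    # Reading 2: `gain` is a relative increase over the baseline.
    relative = model_acc / (1 + gain / 100)
    return round(absolute, 1), round(relative, 1)


MODEL_ACC = 82.1  # average recognition rate stated in the abstract (%)
GAINS = {"SignSpeaker": 21.0, "CG-Recognizer": 6.8}  # stated improvements

for name, gain in GAINS.items():
    absolute, relative = implied_baselines(MODEL_ACC, gain)
    print(f"{name}: {absolute}% if percentage points, {relative}% if relative")
```

If the improvements are percentage-point differences, which is the more common convention when recognition rates are compared, the first value of each pair applies.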
Received: 2023-04-07
Revised: 2023-05-24
Funding: Natural Science Foundation of Anhui Province (2208085MF165)