Mask-wearing Face Recognition Method Fused with Dual Attention Mechanism

Computer and Modernization ›› 2023, Vol. 0 ›› Issue (02): 72-77.

(College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China)

Online: 2023-04-10 Published: 2023-04-10
About the authors: SHENG Jiang-an (born 1997), male, from Hefei, Anhui, master's student; research interests: image processing, face recognition; E-mail: 1730686636@qq.com. CHEN Shu-rong (born 1972), female, from Jishan, Shanxi, associate professor, master's degree; research interests: modern communication networks and control, image processing, video analysis and processing.
Funding:
Key Research and Development Program of Shaanxi Province (2022GY-039)

Abstract

Abstract: To address the problem that existing face recognition models cannot effectively extract regional features from masked faces, a masked face recognition model that fuses a dual attention mechanism is proposed. First, self-built masked face images are used as input data, ResNet50 is taken as the baseline network, and coordinate attention and split attention mechanisms are introduced into its residual blocks. Coordinate attention reduces feature extraction from the mask region and thereby lowers the interference of mask-region features; split attention extracts features of the non-mask region at a fine granularity, drawing more features from key facial parts. The ArcFace classification function is then used to optimize the classification boundary and is combined with a cross-entropy loss function as a constraint to achieve fine-grained recognition of masked faces. Experimental results show that the proposed model achieves a recognition accuracy of 95.2% on the test set, which is 1 percentage point and 1.5 percentage points higher than the ResNet50 and AttentionNet models, respectively.

Key words: mask-wearing face recognition model, coordinate attention, split attention, ArcFace classification function, cross-entropy loss function
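
The abstract combines the ArcFace classification function with a cross-entropy loss. As an illustration only, the following minimal sketch shows how an additive angular margin can be applied to the true-class logit before a softmax cross-entropy is computed for a single sample. The function names, the toy two-class weights, and the values s = 64 and m = 0.5 are assumptions made for this sketch and are not taken from the paper. It also omits the special handling that practical implementations apply when the angle plus the margin exceeds π.

```python
import math


def arcface_logits(embedding, class_weights, label, s=64.0, m=0.5):
    """Scaled cosine logits with an additive angular margin m on the true class.

    s and m are illustrative defaults (assumed, not reported in the abstract).
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    e = normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(e, normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # keep acos in its domain
        if j == label:
            # Penalize the true class by enlarging its angle to the class center.
            cos_theta = math.cos(math.acos(cos_theta) + m)
        logits.append(s * cos_theta)
    return logits


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


# Toy example with two hypothetical identity centers and one 2-D embedding.
weights = [[1.0, 0.0], [0.0, 1.0]]
sample = [0.6, 0.5]
loss_plain = cross_entropy(arcface_logits(sample, weights, 0, m=0.0), 0)
loss_margin = cross_entropy(arcface_logits(sample, weights, 0, m=0.5), 0)
print(loss_plain, loss_margin)  # the margin makes the loss larger for this sample
```

In this toy case the margin raises the loss for an embedding that lies close to the decision boundary, which pushes training toward tighter and better separated classes.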

SHENG Jiang-an, CHEN Shu-rong. Mask-wearing Face Recognition Method Fused with Dual Attention Mechanism[J]. Computer and Modernization, 2023, 0(02): 72-77.
