Computer and Modernization ›› 2021, Vol. 0 ›› Issue (05): 20-25.

• Image Processing •

  • Author bios: CHEN Mingyao (1995—), female, born in Nanyang, Henan; master's student; research interest: computer vision; E-mail: 1105990650@qq.com. XU Kun (1974—), female, associate professor, Ph.D.; research interests: computer vision, digital image processing; E-mail: xkun@chd.edu.cn. LI Xiaoxuan (1998—), female, master's student; research interest: computer vision.
  • Funding:
    National Natural Science Foundation of China (61703054); Key Research and Development Program of Shaanxi Province (2018ZDXM-GY-044); Aerospace Pre-research Project (030101)

A Hand Gesture Segmentation Method Based on Style Transfer

  1. (School of Information Engineering, Chang’an University, Xi’an 710064, China)
  • Online:2021-06-03 Published:2021-06-03


Abstract: Hand gesture segmentation methods based on fully convolutional networks depend heavily on large amounts of accurate per-pixel annotations for training; in addition, because the extracted features lack sufficient context information, they often suffer from misclassification caused by intra-class inconsistency. To address these issues, a hand gesture segmentation method based on style transfer is proposed. Firstly, the first five layers of the HGR-Net hand gesture segmentation network are selected as the backbone network, and a context-information enhancement layer is added to each layer of the backbone. In this layer, global average pooling and a channel attention mechanism are combined to increase the weights of discriminative feature channels and preserve the continuity of context information in the features, thereby resolving the intra-class inconsistency. Secondly, to improve the generalization ability of the proposed segmentation model and address the segmentation of cross-domain samples, a domain adaptation method based on style transfer is proposed: a pre-trained VGG network applies style transfer preprocessing to each source-domain test image, so that it retains its own content while taking on the style of the target-domain training samples. On the OUHANDS dataset, the proposed method achieves mIoU and MPA values of 0.9143 and 0.9363 respectively, which are 3.2 and 1.8 percentage points higher than those of HGR-Net. On a self-collected dataset, applying the style transfer method raises the mIoU and MPA values by 19 and 23 percentage points respectively compared with omitting it. The proposed style-transfer-based domain adaptation method offers a new approach to cross-domain segmentation of unlabeled samples.
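The context-information enhancement layer described above combines global average pooling with channel attention to reweight feature channels. A minimal numpy sketch of such a squeeze-and-excitation-style reweighting is shown below; the two-layer gating network, the reduction ratio, and all tensor sizes are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map.

    feat : (C, H, W) feature map
    w1   : (C//r, C) squeeze weights (r = reduction ratio)
    w2   : (C, C//r) excitation weights
    Returns the feature map with every channel rescaled by a
    learned importance weight in (0, 1).
    """
    # Global average pooling: one scalar descriptor per channel.
    squeeze = feat.mean(axis=(1, 2))            # shape (C,)
    # Two-layer gating produces the per-channel attention weights.
    hidden = np.maximum(w1 @ squeeze, 0.0)      # ReLU
    weights = sigmoid(w2 @ hidden)              # shape (C,), in (0, 1)
    # Salient channels keep most of their magnitude; others are damped.
    return feat * weights[:, None, None]

rng = np.random.default_rng(0)
c, r = 8, 2
feat = rng.standard_normal((c, 16, 16))
w1 = rng.standard_normal((c // r, c)) * 0.1
w2 = rng.standard_normal((c, c // r)) * 0.1
out = channel_attention(feat, w1, w2)
```

Each output channel is a uniformly scaled copy of the corresponding input channel, so spatial structure is preserved while channel importance changes.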

Key words: hand gesture segmentation, HGR-Net, context, style transfer
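For reference, the two reported metrics (mIoU and MPA) can be computed from a confusion matrix over predicted and ground-truth label maps. This is a generic sketch for binary hand/background segmentation, not code from the paper:

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes=2):
    """Mean IoU and mean pixel accuracy from integer label maps."""
    pred = pred.ravel()
    gt = gt.ravel()
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (gt, pred), 1)  # unbuffered accumulation
    tp = np.diag(cm).astype(float)
    # IoU per class: TP / (TP + FP + FN).
    iou = tp / (cm.sum(axis=0) + cm.sum(axis=1) - tp)
    # Pixel accuracy per class: TP / (pixels of that class in gt).
    pa = tp / cm.sum(axis=1)
    return iou.mean(), pa.mean()

gt   = np.array([[0, 0, 1, 1]])
pred = np.array([[0, 1, 1, 1]])
miou, mpa = segmentation_metrics(pred, gt)  # mIoU ≈ 0.583, MPA = 0.75
```

With more classes, the same confusion-matrix formulation applies unchanged; only `num_classes` differs.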