Detection Method of One-shot Legend Based on Siamese Neural Networks

Abstract

Abstract: Existing deep learning detectors are hard to train, slow at inference, and dependent on training data that is difficult to obtain. To address these problems in legend detection, a new method is proposed. It uses an efficient convolutional neural network as the backbone and, exploiting the fact that legends have a fixed aspect ratio and are independent of one another, performs object detection with a new SiameseSSD framework. The framework consists of a Siamese subnet for feature extraction and a modified SSD subnet for classification and regression. The model is trained with data augmentation and a special image-pairing algorithm. By addressing the one-shot problem and adjusting the network structure and detection procedure, it can detect legends in high-resolution construction drawings. Experiments on a construction drawing dataset show that the method is a new approach to one-shot learning tasks, with an accuracy of 91.3% and a detection speed of 61 frames/s. It has some advantages over other existing object detection methods and nearly meets the needs of practical engineering work.

Key words: legend detection, Siamese network, data augmentation, one-shot learning
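
The weight-sharing idea behind the Siamese feature-extraction subnet can be illustrated with a toy example. The sketch below is not the SiameseSSD model described in the abstract. It assumes two hand-written 2x2 filters in place of a trained backbone, cosine similarity in place of the SSD classification and regression subnet, and a sliding window the same size as the template (a simplification suggested by the fixed aspect ratio of legends). The names `FILTERS`, `embed`, `cosine`, and `locate` are hypothetical.

```python
import math

# Hypothetical shared "weights": both branches use the same two 2x2 filters.
FILTERS = [
    [[1.0, -1.0], [1.0, -1.0]],   # left column minus right column
    [[1.0, 1.0], [-1.0, -1.0]],   # top row minus bottom row
]

def embed(patch):
    """Shared branch: valid 2x2 convolution per filter, ReLU, then flatten."""
    h, w = len(patch), len(patch[0])
    feats = []
    for f in FILTERS:
        for i in range(h - 1):
            for j in range(w - 1):
                s = sum(f[a][b] * patch[i + a][j + b]
                        for a in range(2) for b in range(2))
                feats.append(max(s, 0.0))
    return feats

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def locate(template, drawing):
    """Slide a template-sized window over the drawing and return
    (best_score, (row, col)) of the most similar window."""
    th, tw = len(template), len(template[0])
    t_feat = embed(template)          # the single example is embedded once
    best_score, best_pos = -1.0, None
    for i in range(len(drawing) - th + 1):
        for j in range(len(drawing[0]) - tw + 1):
            window = [row[j:j + tw] for row in drawing[i:i + th]]
            score = cosine(t_feat, embed(window))  # same embed() as the template
            if score > best_score:
                best_score, best_pos = score, (i, j)
    return best_score, best_pos

# Toy data: a plus-shaped "legend" pasted into an otherwise empty 6x6 "drawing".
template = [[0, 1, 0],
            [1, 1, 1],
            [0, 1, 0]]
drawing = [[0] * 6 for _ in range(6)]
for a in range(3):
    for b in range(3):
        drawing[2 + a][3 + b] = template[a][b]

print(locate(template, drawing))
```

Because the template and every window pass through the same `embed` function, a single example of a legend is enough to search for it. That property is what makes a Siamese structure a natural fit for one-shot detection.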

WANG Chao-qi, GONG Fa-ming. Detection Method of One-shot Legend Based on Siamese Neural Networks[J]. Computer and Modernization, 2020(12): 116-122.
