Student Specific Behavior Recognition Based on Improved YOLOv3 Network

doi:10.3969/j.issn.1006-2475.2020.07.018

Abstract

To improve the detection accuracy of convolutional neural networks for student behavior recognition, this paper applies K-means clustering to a purpose-built dataset to obtain better-suited anchor boxes, and proposes a YOLOv3 network model based on an improved loss function. The model dynamically reweights the original sum-of-squares loss function, emphasizing the loss on continuous variables. The new loss function effectively reduces the vanishing-gradient effect of the Sigmoid function, so the model converges faster. Experimental results show that the deep convolutional neural network with the improved loss function improves recognition of all three postures: "looking up", "looking down", and "talking".

Keywords: K-means; image enhancement; loss function; YOLOv3 network; posture recognition
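The anchor-box step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric or the number of clusters, so the code below assumes the 1 - IoU distance commonly used for YOLO anchors, with boxes compared by width and height only. The function names `iou_wh` and `kmeans_anchors` and their parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (N, 2) boxes and (K, 2) centroids given as (width, height).

    All boxes are treated as sharing one corner, so only their shape matters.
    """
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_b + area_c - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (assumed distance: 1 - IoU)."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct boxes picked at random.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        # Highest IoU is the same as smallest 1 - IoU distance.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Return anchors sorted by area, smallest first.
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]
```

Calling `kmeans_anchors(wh, k=9)` on an array `wh` of labeled box widths and heights would yield nine anchors, matching the anchor count of the standard YOLOv3 configuration; whether the paper uses nine is not stated in the abstract.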

CLC number: TP311

Cite this article: WANG Chun-hui, WANG Quan-min. Student Specific Behavior Recognition Based on Improved YOLOv3 Network[J]. Computer and Modernization, 2020, 0(07): 90-96.

