Computer and Modernization (计算机与现代化)

• Data Mining

Research on Object Detection and Content Recommendation System in Short Video Based on Deep Learning

  (1. College of Physical Science and Technology, Central China Normal University, Wuhan 430079, China; 2. Baidu.com Times Technology (Beijing) Co., Ltd., Beijing 100089, China)
  • Received: 2018-04-16 Online: 2018-11-22 Published: 2018-11-23
  • About the authors: SHI Yinqiao (石殷巧, b. 1993), female, from Anqing, Anhui; master's student at the College of Physical Science and Technology, Central China Normal University; research interests: machine learning and software development. LIU Shouyin (刘守印, b. 1964), male, from Zhoukou, Henan; professor, doctoral supervisor, Ph.D.; research interests: wireless communication, the Internet of Things, and machine learning. MA Chao (马超, b. 1990), male, from Wuhan, Hubei; engineer at Baidu.com Times Technology (Beijing) Co., Ltd.; research interests: software development and machine vision.

Abstract: Short video has developed rapidly in recent years, and short-video advertising has promising market prospects. However, the pre-roll advertising used for long videos is inserted rigidly into the content, which is inefficient, degrades the user experience, and is poorly suited to short videos. Following the principles of high relevance, low interruption, and conciseness, this paper proposes a scheme for video object detection and content recommendation based on the deep learning model Faster R-CNN. The scheme matches displayed advertisements to video content so as to balance recommendation against user experience. Two modes are provided, chosen according to the video source and the network environment: Cloud Mode and Mobile Terminal Mode. Cloud Mode consists of a server, a Content Delivery Network (CDN), and clients. The server detects and recognizes objects in CDN videos in advance and matches each video to corresponding advertising content using recommendation algorithms; the matched content is then played on the mobile client. Mobile Terminal Mode mainly handles non-CDN resources such as local videos, completing object detection, recognition, and content recommendation within the limited computing resources of the mobile device. In this mode the system uses the lightweight MobileNet model to improve detection speed and accuracy and to reduce memory footprint. To further increase efficiency and achieve real-time performance in Mobile Terminal Mode, Java and C++ code are compiled jointly, a self-developed player is adopted, and the number of object categories is reduced through a feedback system.

Key words: deep learning, object detection, content recommendation, Faster R-CNN, MobileNet

CLC Number: