Computer and Modernization

• Application and Development •

Unsupervised Video Object Segmentation with Fully Convolutional Network

  

  1. (College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China)
  • Received: 2018-12-13  Online: 2019-06-14  Published: 2019-06-14
  • About the authors: HUO Da (1995-), male, born in Linyi, Shandong, master's student; research interests: computer graphics and image processing; E-mail: 1457322657@qq.com. MA Shiyuan (1995-), male, born in Dandong, Liaoning, master's student; research interest: computer vision.
  • Funding:
    Supported by the National Natural Science Foundation of China (61379106, 61379082, 61227802) and the Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011)

Abstract: Pixel-level segmentation of objects in video is a research hotspot in computer vision, and unsupervised video segmentation, which uses no user annotation at all, places even higher demands on segmentation algorithms. In recent years, methods that model inter-frame motion information have been widely used: motion cues such as optical flow are used to predict the object contour, and a model built on features such as color then performs the segmentation. To address the foreground-background confusion and rough edges these methods produce, this paper proposes a video object segmentation method that incorporates a fully convolutional network. First, the contours of salient objects in the video sequence are predicted by a fully convolutional network and refined with the motion-saliency labels obtained from optical flow. A spatio-temporal graph model is then built, and the final prediction labels are obtained by graph cuts. The proposed method is evaluated on two common benchmark datasets, SegTrack v2 and DAVIS, and the results show that it clearly improves segmentation quality over methods based only on inter-frame motion information.
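
The label-refinement step described in the abstract (network-predicted saliency corrected by a motion label, then fused into a foreground mask) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: simple frame differencing stands in for optical flow, and `fuse_labels` with its `alpha` weight is a hypothetical fusion rule introduced only for this sketch.

```python
import numpy as np

def motion_saliency(prev_frame, next_frame, threshold=0.1):
    # Cheap stand-in for optical-flow magnitude: normalised absolute
    # frame difference, thresholded to a binary motion label.
    diff = np.abs(next_frame.astype(float) - prev_frame.astype(float))
    sal = diff / (diff.max() + 1e-8)
    return sal > threshold

def fuse_labels(contour_prob, motion_mask, alpha=0.5):
    # Weighted vote between the network's per-pixel saliency probability
    # and the binary motion label; foreground where the vote exceeds 0.5.
    score = alpha * contour_prob + (1.0 - alpha) * motion_mask.astype(float)
    return score > 0.5

# Two synthetic 8x8 grey frames with a bright square moving one pixel.
prev_f = np.zeros((8, 8)); prev_f[1:3, 1:3] = 1.0
next_f = np.zeros((8, 8)); next_f[2:4, 2:4] = 1.0

motion = motion_saliency(prev_f, next_f)             # True where pixels changed
contour = np.zeros((8, 8)); contour[2:4, 2:4] = 0.9  # mock FCN saliency output
mask = fuse_labels(contour, motion)                  # fused foreground mask
print(int(mask.sum()))                               # pixels labelled foreground
```

In the paper itself the motion term comes from optical flow rather than frame differencing, and the final labels come from a graph cut over a spatio-temporal graph model rather than a fixed-threshold vote.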

Key words: video segmentation, object segmentation, deep features, unsupervised, fully convolutional network

CLC Number: