Computer and Modernization

• Application and Development •

Unsupervised Video Object Segmentation with Fully Convolutional Network

  

  1. (College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China)
  • Received: 2018-12-13  Online: 2019-06-14  Published: 2019-06-14
  • About the authors: HUO Da (1995-), male, born in Linyi, Shandong, master's student; research interests: computer graphics and image processing; E-mail: 1457322657@qq.com. MA Shiyuan (1995-), male, born in Dandong, Liaoning, master's student; research interest: computer vision.
  • Funding:
    Supported by the National Natural Science Foundation of China (61379106, 61379082, 61227802) and the Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011)

Abstract: Pixel-level segmentation of objects in video is a research hotspot in computer vision, and unsupervised video segmentation, which uses no user annotation at all, places even higher demands on segmentation algorithms. In recent years, methods that model inter-frame motion information have been widely used: motion cues such as optical flow are used to predict the object contour, and a model built on features such as color then performs the segmentation. To address the foreground-background confusion and rough edges these methods produce, this paper proposes a video object segmentation method that incorporates a fully convolutional network. First, the contours of salient objects in the video sequence are predicted by a fully convolutional network and refined with the motion-saliency labels obtained from optical flow. A spatio-temporal graph model is then built, and the final prediction labels are obtained by graph cuts. The proposed method is evaluated on two common benchmark datasets, SegTrack v2 and DAVIS, and the results show that it clearly improves segmentation quality over methods based only on inter-frame motion information.
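
The label-refinement step described in the abstract (network-predicted saliency corrected by a motion label, then fused into a foreground mask) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: simple frame differencing stands in for optical flow, and `fuse_labels` with its `alpha` weight is a hypothetical fusion rule introduced only for this sketch.

```python
import numpy as np

def motion_saliency(prev_frame, next_frame, threshold=0.1):
    # Cheap stand-in for optical-flow magnitude: normalised absolute
    # frame difference, thresholded to a binary motion label.
    diff = np.abs(next_frame.astype(float) - prev_frame.astype(float))
    sal = diff / (diff.max() + 1e-8)
    return sal > threshold

def fuse_labels(contour_prob, motion_mask, alpha=0.5):
    # Weighted vote between the network's per-pixel saliency probability
    # and the binary motion label; foreground where the vote exceeds 0.5.
    score = alpha * contour_prob + (1.0 - alpha) * motion_mask.astype(float)
    return score > 0.5

# Two synthetic 8x8 grey frames with a bright square moving one pixel.
prev_f = np.zeros((8, 8)); prev_f[1:3, 1:3] = 1.0
next_f = np.zeros((8, 8)); next_f[2:4, 2:4] = 1.0

motion = motion_saliency(prev_f, next_f)             # True where pixels changed
contour = np.zeros((8, 8)); contour[2:4, 2:4] = 0.9  # mock FCN saliency output
mask = fuse_labels(contour, motion)                  # fused foreground mask
print(int(mask.sum()))                               # pixels labelled foreground
```

In the paper itself the motion term comes from optical flow rather than frame differencing, and the final labels come from a graph cut over a spatio-temporal graph model rather than a fixed-threshold vote.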

Key words: video segmentation, object segmentation, deep features, unsupervised, fully convolutional network

CLC Number: