计算机与现代化 (Computer and Modernization) ›› 2024, Vol. 0 ›› Issue (02): 64-68. doi: 10.3969/j.issn.1006-2475.2024.02.010

• Image Processing •

Optical Flow Estimation Based on Inverse Residual Attention

  


  (1. School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China; 2. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China)
  • Online: 2024-02-19    Published: 2024-03-19
  • About the authors: LIANG Jianye (born 1996), male, from Zhaoqing, Guangdong, master's student; research interests: computer vision, optical flow estimation; E-mail: 2112005183@mail2.gdut.edu.cn. Corresponding author: CHEN Junhong (born 1995), male, Ph.D. candidate; research interests: robotics, computer vision; E-mail: CSChenjunhong@hotmail.com. FANG Guibiao (born 1996), male, master's student; research interests: robotics, computer vision; E-mail: 2112005040@mail2.gdut.edu.cn.
  • Funding:
    Supported by the National Natural Science Foundation of China (91748107, 62076073, 61902077); the Guangdong Province Program for Introducing Innovative Research Teams (2014ZT05G157);
    the Guangdong Basic and Applied Basic Research Foundation (2020A1515010616); and the Special Funds for Science and Technology Innovation Strategy of Guangdong Province (pdjh2020a0173)



Abstract: Optical flow estimation is a fundamental task in video understanding and analysis. Many existing methods simply treat occluded pixels as outliers and remove them, which improves the model's ability to compute optical flow but easily introduces discontinuities in image intensity and causes the estimation to fail. In addition, the large displacements caused by fast-moving objects have long been a difficulty for optical flow estimation. To address these problems, this paper proposes a generative adversarial learning framework based on inverse residual attention, FlowTranGAN (FTGAN), for optical flow estimation. The framework designs an inverse residual attention module that enhances the spatial information of features and improves the matching between pixels, and it uses a U-Net-based discriminator to constrain the generator, which reduces the error and discontinuity of the estimated flow and improves the generalization ability of the model. Experimental results on the KITTI-2015 and MPI-Sintel datasets demonstrate the effectiveness and superiority of the proposed FTGAN.
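The abstract describes the architecture only at a high level, so the following is a minimal PyTorch sketch, not the authors' implementation, of what an inverted-residual block combined with spatial attention can look like. The class name InverseResidualAttention, the expansion factor, the 7x7 attention convolution, and the point at which the attention map is applied are all assumptions made for illustration.

# Minimal sketch (assumptions, not the paper's code): an inverted-residual
# block (expand -> depthwise -> project) whose output is re-weighted by a
# spatial attention map before the residual connection.
import torch
import torch.nn as nn

class InverseResidualAttention(nn.Module):
    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion
        # Inverted residual: expand channels, filter spatially, project back.
        self.expand = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
        )
        self.depthwise = nn.Sequential(
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
        )
        self.project = nn.Sequential(
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Spatial attention: a single-channel map that re-weights every
        # location, intended to strengthen the cues used for pixel matching.
        self.spatial_attn = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.project(self.depthwise(self.expand(x)))
        y = y * self.spatial_attn(y)   # spatially re-weight the features
        return x + y                   # residual connection

if __name__ == "__main__":
    feats = torch.randn(2, 64, 48, 64)   # e.g. an encoder feature map
    block = InverseResidualAttention(64)
    print(block(feats).shape)            # torch.Size([2, 64, 48, 64])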

Key words: optical flow estimation, inverse residual attention, generative adversarial learning, supervised learning
