Computer and Modernization ›› 2020, Vol. 0 ›› Issue (11): 16-22.

• Image Processing •

Target Tracking Algorithm Based on Multimodal Data

  1. (College of Computer and Information, Hohai University, Nanjing 211100, Jiangsu, China)
  • Online: 2020-12-03  Published: 2020-12-03
  • About the authors: ZHOU Jingwei (周经纬, b. 1997), male, from Taizhou, Jiangsu, master's student; research interests: computer vision, target tracking; E-mail: 513189227@qq.com. HAN Lixin (韩立新, b. 1967), male, doctoral supervisor; research interests: information retrieval, pattern recognition, data mining; E-mail: lixinhan2002@aliyun.com. LI Xiaoshuang (李晓双, b. 1996), male, master's student; research interest: computer vision; E-mail: 290598477@qq.com.
  • Supported by:
    Fundamental Research Funds for the Central Universities (B200202180)

Abstract: To address target occlusion and complex backgrounds in target tracking, a target tracking algorithm based on multimodal data is proposed. First, the data from each modality are fused at the pixel level to reduce the impact of insufficient information in single-modality data on the tracking result. Next, different features are extracted from the fused image and filtered. The resulting response maps are then fused at the decision level, which addresses tracking failures caused by the drift of a single model. Finally, the tracking result is obtained from the peak of the fused response map. In addition, an occlusion detection module is added to the tracking process to further enhance the robustness of the model. Evaluation on the Princeton Tracking Benchmark shows that, compared with other mainstream algorithms, the proposed algorithm improves tracking accuracy on target-occlusion videos by 8.4% and the overlap success rate by 3.3%, indicating good resistance to occlusion.

Key words: computer vision, object tracking, correlation filter, multimodal fusion, occlusion detection
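
The pipeline outlined in the abstract (pixel-level fusion, decision-level fusion of response maps, peak localization, occlusion check) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the fusion rules, so the fixed blending weight `alpha`, the peak-proportional weighting of response maps, and the `occlusion_threshold` test are all assumptions made for illustration, as are the function names. The response maps are taken as given inputs; the filtering step that produces them is omitted.

```python
"""Toy sketch of the tracking steps named in the abstract (all rules are assumed)."""


def fuse_pixels(image_a, image_b, alpha=0.5):
    """Pixel-level fusion: per-pixel weighted average of two aligned
    single-channel images (hypothetically a grayscale frame and a depth map)."""
    return [[alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]


def peak(response):
    """Return (value, row, col) of the first maximum in a 2-D response map."""
    best_value, best_row, best_col = response[0][0], 0, 0
    for r, row in enumerate(response):
        for c, value in enumerate(row):
            if value > best_value:
                best_value, best_row, best_col = value, r, c
    return best_value, best_row, best_col


def fuse_responses(responses):
    """Decision-level fusion: weight each response map by its own peak value
    (an assumed rule), so a weak or drifting model contributes less."""
    peaks = [peak(resp)[0] for resp in responses]
    total = sum(peaks)
    if total > 0:
        weights = [p / total for p in peaks]
    else:
        weights = [1.0 / len(responses)] * len(responses)
    rows, cols = len(responses[0]), len(responses[0][0])
    return [[sum(w * resp[r][c] for w, resp in zip(weights, responses))
             for c in range(cols)]
            for r in range(rows)]


def track_step(responses, occlusion_threshold=0.3):
    """Locate the target at the peak of the fused map and flag occlusion
    when that peak is low (assumed occlusion test)."""
    fused = fuse_responses(responses)
    value, row, col = peak(fused)
    return (row, col), value < occlusion_threshold


if __name__ == "__main__":
    # Two hypothetical response maps from filters on different features;
    # the second one has drifted one column to the right.
    resp_a = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.1]]
    resp_b = [[0.1, 0.1, 0.1], [0.1, 0.6, 0.7], [0.1, 0.1, 0.1]]
    position, occluded = track_step([resp_a, resp_b])
    print(position, occluded)
```

In this toy input the fused peak stays at the center cell `(1, 1)` even though the second map peaks one column to the right, which is the kind of drift that decision-level fusion is meant to absorb. A tracker following this scheme would typically skip the model update on frames flagged as occluded, though the abstract does not specify how the occlusion flag is used.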