Computer and Modernization, 2024, Vol. 0, Issue (08): 59-66. doi: 10.3969/j.issn.1006-2475.2024.08.011

• Artificial Intelligence •

Multi-object Tracking of UAV Based on Improved YOLOX and New Data Association Method

  1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; 2. Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China; 3. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
  • Online: 2024-08-28 Published: 2024-08-28

Abstract: Multi-object tracking in UAV videos is an important computer vision task with applications across many fields. To address the challenges of target occlusion, small objects, and complex, changing backgrounds in UAV video scenes, this paper proposes an improved UAV multi-object tracking model. First, YOLOX is improved in three ways: a Swin Transformer is integrated into the network to strengthen global information extraction, an additional detection head is added to improve small-object detection, and a CBAM attention module is introduced to help the network focus on informative features. In the data association stage, a new association scheme retains all detection boxes and divides them into high-score and low-score boxes according to their confidence. High-score boxes are first associated with the existing trajectories; the trajectories left unmatched are then associated with the low-score boxes in a second pass. Experimental results on the public datasets VisDrone2021 and UAVDT show that the proposed method delivers superior and robust performance in UAV multi-object tracking scenarios.

Key words: multi-object tracking, unmanned aerial vehicle (UAV) videos, attention mechanism, data association
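
The two-pass association described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the similarity measure, the assignment algorithm, or the confidence threshold, so the IoU similarity, the greedy matching, the values `score_thresh=0.6` and `min_iou=0.3`, and all function names below are assumptions chosen only for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], dets: Dict[int, Box],
                 min_iou: float) -> Dict[int, int]:
    """Pair tracks with detections greedily by descending IoU.

    Assumption: greedy matching stands in for whatever assignment
    algorithm the paper actually uses. Returns {track_id: detection_index}.
    """
    candidates = sorted(
        ((iou(t_box, d_box), t_id, d_idx)
         for t_id, t_box in tracks.items()
         for d_idx, d_box in dets.items()),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, t_id, d_idx in candidates:
        if score < min_iou:
            break  # every remaining candidate overlaps even less
        if t_id in matches or d_idx in used:
            continue
        matches[t_id] = d_idx
        used.add(d_idx)
    return matches


def associate(tracks: Dict[int, Box], detections: List[Tuple[Box, float]],
              score_thresh: float = 0.6, min_iou: float = 0.3):
    """Two-pass association that keeps every detection box.

    score_thresh and min_iou are illustrative values, not taken from the paper.
    """
    # Split all detections by confidence; none are discarded.
    high = {i: box for i, (box, s) in enumerate(detections) if s >= score_thresh}
    low = {i: box for i, (box, s) in enumerate(detections) if s < score_thresh}

    # First pass: high-score boxes against all tracks.
    first = greedy_match(tracks, high, min_iou)

    # Second pass: tracks still unmatched against low-score boxes.
    remaining = {t: b for t, b in tracks.items() if t not in first}
    second = greedy_match(remaining, low, min_iou)

    matches = {**first, **second}
    unmatched_tracks = [t for t in tracks if t not in matches]
    return matches, unmatched_tracks


# Track 2 is partly occluded, so its detection has a low score (0.3).
tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [((1, 1, 11, 11), 0.9), ((21, 21, 31, 31), 0.3), ((50, 50, 60, 60), 0.2)]
matches, unmatched_tracks = associate(tracks, detections)
print(matches)           # {1: 0, 2: 1}
print(unmatched_tracks)  # []
```

In this toy input, track 2 is recovered only in the second pass; a tracker that discarded low-score boxes would have left it unmatched, which is the situation the retained low-score boxes are meant to handle.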

CLC number: