计算机与现代化 ›› 2025, Vol. 0 ›› Issue (04): 89-95.doi: 10.3969/j.issn.1006-2475.2025.04.014

• Image Processing •

  • Supported by: Natural Science Foundation of Liaoning Province (2023010411-JH3/101)

Infrared and Visible Image Fusion Based on Twin Axial-attention and Dual-discriminator Generative Adversarial Network

  1. (1. School of Information Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China;
    2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China)
  • Online:2025-04-30 Published:2025-04-30



Abstract: A fused infrared and visible image of the same scene preserves both the thermal radiation information of foreground targets and the texture details of the background, yielding a more comprehensive and accurate description. However, many classical deep-learning fusion algorithms suffer from insufficient information retention and unbalanced feature fusion. To address these problems, an image fusion algorithm based on twin axial-attention and a dual-discriminator generative adversarial network is proposed. The generator uses a double-dense convolutional network as a multi-scale feature extractor and introduces a spatial enhancement branch and twin axial attention to capture local information and long-range dependencies. The two discriminators each play an adversarial game against the generator, balancing the degree to which the distinctive features of the two source images are retained by constraining the similarity between each source image and the fused image. In addition, a perceptual loss based on a pre-trained VGG19 mitigates the loss of high-level features such as semantic information. Experiments on the TNO dataset show that the proposed method produces fusion results with prominent targets and clear textures, and achieves significant improvements over other classical algorithms in both subjective and objective evaluation metrics, demonstrating its effectiveness.
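The axial-attention idea referenced in the abstract — attending along the height axis and then the width axis, so that long-range dependencies are captured at O(HW·(H+W)) rather than O((HW)²) cost — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses identity query/key/value projections and a single head, and all names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(feat):
    """Self-attention applied along the H axis, then along the W axis.

    feat: feature map of shape (H, W, C).
    Cost is O(H*W*(H+W)*C), versus O((H*W)**2 * C) for full 2-D
    self-attention over all H*W positions at once.
    """
    H, W, C = feat.shape
    # --- attention along the height axis (independently per column) ---
    x = feat.transpose(1, 0, 2)                       # (W, H, C)
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(C)    # (W, H, H)
    x = softmax(scores, axis=-1) @ x                  # (W, H, C)
    out = x.transpose(1, 0, 2)                        # back to (H, W, C)
    # --- attention along the width axis (independently per row) ---
    scores = out @ out.transpose(0, 2, 1) / np.sqrt(C)  # (H, W, W)
    out = softmax(scores, axis=-1) @ out                # (H, W, C)
    return out
```

In the paper's generator this operation is applied twice (the "twin" branches) alongside a spatial enhancement branch; the sketch above only shows the axial factorization itself.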

Key words: image fusion, generative adversarial networks, axial-attention module, dual-discriminators, DenseNet
