Computer and Modernization ›› 2023, Vol. 0 ›› Issue (07): 79-85. doi: 10.3969/j.issn.1006-2475.2023.07.014

• Image Processing •

A Remote Sensing Image Change Detection Model Based on CNN-Transformer Hybrid Structure

  1. Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, China
  2. Xi'an Branch, Northwest Group Corporation, China Electronics Technology Group Corporation, Xi'an 710065, China
  3. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
  • Online: 2023-07-26 Published: 2023-07-27
  • About the authors: XU Yetong (许叶彤, born 1999), female, from Wugong, Shaanxi, master's student; research interest: computer vision; E-mail: xuyetong1999@163.com. GENG Xinzhe (耿信哲, born 1997), male, master's student; research interests: image processing, pattern recognition; E-mail: 201606020514@sust.edu.cn. ZHAO Weiqiang (赵伟强, born 1990), male, master's degree; research interests: mobile robot control, navigation; E-mail: zhao_scrat@163.com. ZHANG Yue (张月, born 1998), female, master's student; research interests: deep learning, image processing; E-mail: zhangyue9815@163.com. NING Hailong (宁海龙, born 1993), male, associate professor; research interests: pattern recognition, machine learning, computer vision, and multimodal learning; E-mail: ninghailong93@gmail.com. Corresponding author: LEI Tao (雷涛, born 1981), male, professor; research interests: artificial intelligence, computer vision, machine learning; E-mail: xyt15591836230@163.com.
  • Funding:
    National Natural Science Foundation of China (61871259); Key Research and Development Program of Shaanxi Province (2021ZDLGY08-07, 2022GY-436); Innovation Capability Support Program of Shaanxi Province (2020SS-03)

Abstract: Convolutional neural networks (CNNs) and Transformer models have steadily advanced remote sensing image change detection, but both approaches still have shortcomings. On the one hand, because convolution kernels perceive only local regions, CNNs cannot model the global information in remote sensing images. On the other hand, although Transformers can capture global information, they model the fine details of image changes poorly, and their computational complexity grows quadratically with image resolution. To address these problems and obtain more robust change detection results, this paper proposes a CNN-Transformer Change Detection Network (CTCD-Net) built on a hybrid CNN-Transformer structure. First, CTCD-Net connects a CNN in series with a Transformer-based encoder-decoder structure to effectively encode both the local and the global features of remote sensing images, which improves the network's feature learning ability. Second, a cross-channel Transformer self-attention module (CSA) and an attention feed-forward network (A-FFN) are proposed, which effectively reduce the computational complexity of the Transformer. Extensive experiments on the LEVIR-CD and CDD datasets show that the detection accuracy of CTCD-Net is significantly better than that of other current mainstream methods.

Key words: remote sensing images, change detection, convolutional neural networks, Transformer
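
The abstract does not describe how CSA is computed, so the following is only a rough illustration of the complexity argument, not the CTCD-Net implementation. It assumes single-head attention with identity query/key/value projections, and every function name is hypothetical. Under those assumptions, attending across pixels produces an attention map with (H·W)² entries, while attending across channels produces a C × C map whose size does not depend on image resolution.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m, scale):
    """Scale every score, then apply a numerically stable softmax to each row."""
    result = []
    for row in m:
        scaled = [v * scale for v in row]
        peak = max(scaled)
        exps = [math.exp(v - peak) for v in scaled]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def spatial_attention(tokens):
    """Standard self-attention over pixels; tokens is N x C with N = H * W."""
    channels = len(tokens[0])
    scores = matmul(tokens, transpose(tokens))            # N x N scores
    attn = softmax_rows(scores, 1.0 / math.sqrt(channels))
    return matmul(attn, tokens), attn                     # output is N x C


def channel_attention(tokens):
    """Illustrative attention over channels (assumed form, not the paper's CSA)."""
    n = len(tokens)
    by_channel = transpose(tokens)                        # C x N
    scores = matmul(by_channel, tokens)                   # C x C scores
    attn = softmax_rows(scores, 1.0 / math.sqrt(n))
    return transpose(matmul(attn, by_channel)), attn      # output is N x C


# Toy feature map: 8 x 8 pixels with 4 channels, flattened to 64 tokens.
HEIGHT, WIDTH, CHANNELS = 8, 8, 4
tokens = [[math.sin(i + j) for j in range(CHANNELS)] for i in range(HEIGHT * WIDTH)]

spatial_out, spatial_map = spatial_attention(tokens)
channel_out, channel_map = channel_attention(tokens)

print("spatial map entries:", len(spatial_map) * len(spatial_map[0]))
print("channel map entries:", len(channel_map) * len(channel_map[0]))
```

Doubling the image side length quadruples the number of tokens and therefore multiplies the spatial map by sixteen, while the channel map in this sketch stays at C × C.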

CLC Number: