Computer and Modernization ›› 2024, Vol. 0 ›› Issue (12): 72-77. doi: 10.3969/j.issn.1006-2475.2024.12.011

• Image Processing •

DSMSC Based on Attention Mechanism for Remote Sensing Image Scene Classification

  1. (School of Computer Science, Xi’an Polytechnic University, Xi’an 710048, China)
  • Online: 2024-12-31 Published: 2024-12-31
  • Supported by:
    General Program of the Natural Science Basic Research Plan of Shaanxi Province (2020JM-574)

Abstract: Complex backgrounds and varying object scales in remote sensing imagery limit the accuracy of scene classification models. To address this, the paper proposes an attention-based remote sensing image scene classification model built on a depthwise separable multi-scale dilated convolution feature fusion network (Depthwise Separable Multi-scale Dilated Convolution, DSMSC). First, a feature extraction module built on depthwise separable convolutions extracts deep image features while reducing the parameter count. Next, a multi-scale dilated convolution module enlarges the network’s receptive field to capture global and contextual features of the image. Finally, an attention mechanism makes the network focus on important features, which are then passed to a Softmax classifier. The model is validated on two remote sensing scene datasets, AID and WHU-RS19. Experimental results show that, compared with models such as AlexNet, VGG-16, and ResNet18, the proposed model raises accuracy to 93.32% on AID and 91.15% on WHU-RS19 while keeping a relatively low parameter count, which suggests some application potential for remote sensing image scene classification.

Key words: remote sensing image scene classification; convolutional neural networks; depthwise separable convolution; multi-scale; dilated convolution
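The abstract credits depthwise separable convolution with the lower parameter count and dilated convolution with the larger receptive field. The arithmetic behind both claims can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the channel counts, kernel size, and dilation rates are assumed values chosen for the example, since the abstract does not state the actual layer configuration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span in pixels covered by a k-tap kernel whose taps sit
    `dilation` pixels apart; dilation=1 is an ordinary convolution."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    # Assumed example layer: 64 -> 128 channels with a 3 x 3 kernel.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {sep}, ratio: {sep / std:.3f}")

    # Assumed dilation rates for a multi-scale branch; not taken from the paper.
    for d in (1, 2, 3):
        print(f"dilation {d}: 3 x 3 kernel spans {effective_kernel_size(k, d)} pixels")
```

For this assumed layer, the separable form needs 8768 weights against 73728 for the standard convolution, and a 3 x 3 kernel with dilation 3 spans 7 pixels without adding any weights. Running branches with different dilation rates in parallel is one common way to gather features at several scales.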
