Computer and Modernization, 2022, Vol. 0, Issue (06): 109-115.

• Image Processing •

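  • About the authors: Tan Jinlin (谭金林, b. 1982), male, from Xi'an, Shaanxi, senior engineer, master's degree; research interests: aerospace electronic information systems and remote sensing image processing; E-mail: tjl.king@163.com. Fan Wentong (范文童, b. 1996), male, from Wuhu, Anhui, master's student; research interests: remote sensing image processing, object detection, and FPGA design; E-mail: slgwanmiao@sina.com. Liu Yahu (刘亚虎, b. 1986), male, engineer, master's degree; research interests: remote sensing information processing; E-mail: 15029202330@139.com.
  • Funding:
    Natural Science Foundation of Shaanxi Province, Youth Fund Project (2019JQ270)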

Object Detection in Remote Sensing Images Based on Software and Hardware Co-acceleration Framework

  1. (1. Xidian University, Xi’an 710071, China;
    2. Shaanxi Aerospace Technology Application Research Institute Co., Ltd., Xi’an 710100, China)
  • Online: 2022-06-23 Published: 2022-06-23

Abstract: The computational complexity and memory requirements of object detection models for remote sensing images have grown rapidly, making these models difficult to deploy on small, low-power embedded platforms. To address this problem, this paper proposes a software and hardware co-acceleration framework based on a field-programmable gate array (FPGA) to accelerate inference for object detection in remote sensing images. First, the parameters of the trained YOLOv3 network are compressed and compiled following the Vitis AI acceleration scheme. Second, the underlying hardware project, which includes a deep-learning processing unit (DPU) module, is built on the FPGA, and a DPU task scheduler is written for the ARM processor. Finally, FPGA-based inference acceleration is implemented on a Zynq SoC development platform. Experimental results show that the framework achieves an average throughput of 1.75 TOPS (26.8 fps) on the Xilinx Zynq MPSoC and a mean Average Precision (mAP) of 56.7% on the DIOR dataset.

Key words: remote sensing images, object detection, convolutional neural network, field-programmable gate array
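
The two throughput figures reported in the abstract (1.75 TOPS and 26.8 fps) can be related by simple arithmetic: dividing operations per second by frames per second gives the operations per frame that the two numbers jointly imply. The following is a minimal sketch, assuming that 1 TOPS means 10^12 operations per second and that both figures describe the same inference workload, which the abstract does not state explicitly. The function name `ops_per_frame` is hypothetical and is not taken from the paper.

```python
def ops_per_frame(throughput_tops: float, fps: float) -> float:
    """Operations per frame implied by a throughput (in TOPS) and a frame rate.

    Assumption: 1 TOPS = 1e12 operations per second, and both figures
    were measured on the same workload.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return throughput_tops * 1e12 / fps


if __name__ == "__main__":
    # Figures reported in the abstract: 1.75 TOPS at 26.8 fps.
    giga_ops = ops_per_frame(1.75, 26.8) / 1e9
    print(f"{giga_ops:.1f} GOP per frame")  # prints "65.3 GOP per frame"
```

Under these assumptions, the reported figures correspond to roughly 65 giga-operations per processed frame.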