Computer and Modernization ›› 2023, Vol. 0 ›› Issue (06): 76-81. doi: 10.3969/j.issn.1006-2475.2023.06.013

• IMAGE PROCESSING •

Monocular Depth Estimation Method by Aggregating Multi-dimensional Attention Features

LIU Jia-jia1, HU Xu-xin2, YU Ping2   

  1. Shenzhen Power Supply Planning and Design Institute Co., Ltd, Shenzhen 518000, China;
    2. School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071000, China
  • Received:2022-06-06 Revised:2022-07-17 Online:2023-06-28 Published:2023-06-28

Abstract: To improve the prediction accuracy of monocular depth estimation networks, this study analyzes in depth how multidimensional attention mechanisms affect such networks. Based on the resulting conclusions and observations, a set of optimized channel attention and spatial attention blocks is designed. Building on the convolutional neural network framework based on the local plane guidance layer, a new network structure is then constructed in which the placement of the different blocks is chosen to fully exploit the multidimensional attention mechanism. Combining these two improvements, the study proposes a high-performance monocular depth estimation network that aggregates channel and spatial attention features. Experiments on the KITTI Depth dataset and the NYU Depth V2 dataset confirm the effectiveness of the optimized blocks and the strong performance of the proposed network. Compared with the convolutional neural network based on the local plane guidance layer, the proposed network handles global image features better and predicts depth more accurately, with several evaluation metrics improved to varying degrees. The depth maps it generates also show more of the contours and details of objects.
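
The abstract does not specify how the optimized channel and spatial attention blocks are built. As a rough illustration only, the sketch below shows a generic attention pair in NumPy: a squeeze-and-excitation style channel gate followed by a simplified spatial gate. The function names, the weights `w1` and `w2`, the reduction ratio `r`, and the use of a weighted sum in place of a learned convolution are all assumptions made for this example, not the design proposed in the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    w1 with shape (C // r, C) and w2 with shape (C, C // r) are
    hypothetical bottleneck weights; a real network would learn them.
    """
    pooled = feat.mean(axis=(1, 2))          # (C,) global average pooling
    hidden = np.maximum(w1 @ pooled, 0.0)    # ReLU bottleneck, (C // r,)
    gate = sigmoid(w2 @ hidden)              # (C,) per-channel gate in (0, 1)
    return feat * gate[:, None, None]


def spatial_attention(feat, w_avg=1.0, w_max=1.0):
    """Reweight the spatial positions of a (C, H, W) feature map.

    Assumption: the channel-wise mean and max maps are mixed by two
    scalars, standing in for the learned convolution a real block uses.
    """
    avg_map = feat.mean(axis=0)              # (H, W)
    max_map = feat.max(axis=0)               # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * gate[None, :, :]


# Toy input: random features and random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 5, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 4, 5)
```

Both gates lie in (0, 1), so each block can only attenuate features; where such blocks sit in the encoder or decoder is the placement question the abstract refers to.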

Key words: monocular depth estimation, convolutional neural network, channel attention, spatial attention
