Computer and Modernization, 2024, Vol. 0, Issue (05): 92-98. doi: 10.3969/j.issn.1006-2475.2024.05.016

• Image Processing •

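Multi-view Reconstruction with Local Self-attention and Deep Optimization

  1. (School of Information Engineering, East China University of Technology, Nanchang 330013, China)
  • Online: 2024-05-29 Published: 2024-06-12
  • About the authors: YE Senhui (born 1996), male, from Ganzhou, Jiangxi, master's student; research interest: 3D reconstruction from images; E-mail: 1845407234@qq.com. Corresponding author: WANG Lei (born 1979), female, from Huangpi, Hubei, professor, Ph.D.; research interests: 3D point clouds and 3D reconstruction; E-mail: wlei598@163.com.
  • Funding:
    National Natural Science Foundation of China (42001411); Fund of the Jiangxi Engineering Technology Research Center of Nuclear Geoscience Data Science and System (JELRGBDT202202); Open Fund of the Jiangxi Engineering Laboratory on Radioactive Geoscience and Big Data Technology (JELRGBDT202103)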

Abstract: To address the high memory and time consumption and the low completeness and fidelity of high-resolution reconstruction in multi-view 3D reconstruction, we propose a deep learning-based multi-view reconstruction network. The network consists of a feature extraction module, a cascaded Patchmatch module, and a depth map optimization module. First, we design a U-shaped feature extraction module that extracts multi-stage feature maps and introduces a local self-attention layer with relative position encoding at each stage, capturing both local detail and global context in the images and improving the network's feature extraction. Second, we design a deep residual network that fuses features through dense connections and residual structures, using prior knowledge from the color image to constrain the depth map and improve the accuracy of depth estimation. On the public DTU (Technical University of Denmark) dataset, the experimental results show an effective improvement in 3D reconstruction quality: compared with PatchmatchNet, the network improves completeness by 6.1% and the overall score by 2.5%. Compared with other SOTA (State-Of-The-Art) methods, it also achieves considerably better completeness and overall scores.

Key words: deep learning, 3D reconstruction, local self-attention mechanism, multi-view stereo, depth estimation
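
The abstract names a local self-attention layer with relative position encoding but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' layer. It assumes a single attention head, queries, keys and values that have already been projected, and a relative position encoding realised as one additive bias per window offset. The function name `local_self_attention` and the `rel_bias` table are hypothetical.

```python
import math


def local_self_attention(q, k, v, rel_bias):
    """Single-head local self-attention over an H x W feature map.

    q, k, v  : nested lists of shape H x W x D (already projected; assumption).
    rel_bias : odd-sized square table; rel_bias[dy + r][dx + r] is added to the
               attention logit for the neighbour at offset (dy, dx). This is one
               common way to encode relative position, assumed here.
    Returns a nested list of shape H x W x D.
    """
    h, w, d = len(q), len(q[0]), len(q[0][0])
    r = len(rel_bias) // 2  # window radius
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            logits, values = [], []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # neighbours outside the map are skipped
                    dot = sum(a * b for a, b in zip(q[y][x], k[ny][nx]))
                    logits.append(dot / math.sqrt(d) + rel_bias[dy + r][dx + r])
                    values.append(v[ny][nx])
            # Numerically stable softmax over the in-bounds window.
            peak = max(logits)
            weights = [math.exp(l - peak) for l in logits]
            total = sum(weights)
            out[y][x] = [
                sum(wt * val[c] for wt, val in zip(weights, values)) / total
                for c in range(len(values[0]))
            ]
    return out


if __name__ == "__main__":
    # Toy 2 x 3 map with 2 channels; a 3 x 3 window with zero bias.
    feat = [[[float(y), float(x)] for x in range(3)] for y in range(2)]
    zero_bias = [[0.0] * 3 for _ in range(3)]
    result = local_self_attention(feat, feat, feat, zero_bias)
    print(len(result), len(result[0]), len(result[0][0]))
```

Because each pixel attends only to a small window, the cost grows with the window size rather than with the full image size. That is the usual motivation for local attention when memory and run time are a concern.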
