Computer and Modernization ›› 2024, Vol. 0 ›› Issue (05): 16-21. doi: 10.3969/j.issn.1006-2475.2024.05.004

Layout Analysis Method of Multi-scale Feature Fusion

  

  1. (School of Information Engineering, Chang’an University, Xi’an 710018, China)
  • Online: 2024-05-29 Published: 2024-06-12

Abstract: Current document layout analysis suffers from three problems: lists and text are misclassified, small-scale text in tables is difficult to recognize, and spatial features are poorly preserved. To address them, this paper takes a bottom-up approach and proposes a multi-feature fusion layout analysis method based on the SegNet network. The MSCAN-SE attention module is introduced into SegNet to raise the low recognition rate of small-scale elements in tables: the strip features in MSCAN-SE improve the model's ability to extract multi-scale features, so that the network retains feature information at more scales. Because the features of list elements and text elements are too similar, the dilated convolution and channel attention branch in MSCAN-SE are used to expand the network's receptive field during feature extraction. The proposed method is compared experimentally with classical semantic segmentation networks. On the layout analysis test set, it achieves a pixel accuracy of 97.9% and a mean intersection over union (mIoU) of 91.7%. Compared with the U-Net, FCN, DeepLabV3+, and SegNet semantic segmentation models, the mIoU increases by 7.6%, 2.4%, 2.6%, and 1.5%, respectively.
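The two reported metrics, pixel accuracy and mean intersection over union, are usually computed from a class confusion matrix. The sketch below illustrates those standard definitions on a toy label sequence. It is not the authors' evaluation code. The function names, the three-class toy data, and the choice to skip classes that never occur are all assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical flattened label maps for a 6-pixel image with 3 classes
# (for example 0 = text, 1 = list, 2 = table).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, 3)
print("pixel accuracy:", round(pixel_accuracy(cm), 4))
print("mean IoU:", round(mean_iou(cm), 4))
```

Mean IoU penalizes both missed and spurious pixels for every class, so it is typically lower than pixel accuracy when small classes are confused with larger ones. This is consistent with the gap between the 97.9% and 91.7% figures reported above.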

Key words: document layout analysis, multi-scale attention, semantic segmentation, channel attention
