Computer and Modernization, 2025, Vol. 0, Issue (02): 86-93. doi: 10.3969/j.issn.1006-2475.2025.02.012

Twin Feature Fusion Network for Scene Text Image Super Resolution

  

  1. (School of Computer Science, Xi’an Polytechnic University, Xi’an 710060, China)
  • Online: 2025-02-28 Published: 2025-02-28

Abstract: Scene text image super-resolution (STISR) aims to enhance the resolution and legibility of text images and thereby improve the performance of downstream text recognition. Previous studies have shown that introducing text-prior information can better guide super-resolution. However, existing methods neither exploit the text prior effectively nor fully integrate it with image features, which limits super-resolution performance. To address this problem, we propose a Twin Feature Fusion Network (TFFN). The method aims to make full use of the text-prior information provided by pre-trained text recognizers, with a focus on recovering the content of text regions. First, text-prior information is extracted with a text recognition network. Next, a twin feature fusion module is constructed; it employs a twin attention mechanism to enable bidirectional interaction between image features and text-prior information, and then further integrates the context-enhanced image features with the text prior. Finally, sequence features are extracted to reconstruct the text image. Experiments on the benchmark TextZoom dataset show that the proposed TFFN improves the recognition accuracy of the ASTER, MORAN, and CRNN text recognition networks by 0.22~0.5, 0.6~1.1, and 0.33~1.1 percentage points, respectively.
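
The bidirectional interaction described in the abstract can be illustrated with a generic two-way cross-attention step. The following is only a minimal sketch under stated assumptions, not the TFFN module itself: the abstract does not specify layer sizes, learned projections, or normalization, so the function names (`cross_attention`, `twin_fusion`), the feature dimension, and the sequence lengths are all hypothetical, and the learned query/key/value projections of a real attention layer are omitted for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context):
    """Scaled dot-product attention: each query row attends over all context rows.

    Assumption: no learned projections; queries and context share one feature size.
    """
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d))  # (n_queries, n_context)
    return weights @ context, weights


def twin_fusion(image_feats, text_prior):
    """Hypothetical two-way fusion: each stream is enriched by attending to the other."""
    img_ctx, w_img = cross_attention(image_feats, text_prior)  # image attends to text
    txt_ctx, w_txt = cross_attention(text_prior, image_feats)  # text attends to image
    image_out = image_feats + img_ctx  # residual connection keeps the original features
    text_out = text_prior + txt_ctx
    return image_out, text_out, w_img, w_txt


# Illustrative sizes only (not taken from the paper):
# 64 flattened image positions, 26 text-prior steps, 32 channels.
rng = np.random.default_rng(0)
image_feats = rng.standard_normal((64, 32))
text_prior = rng.standard_normal((26, 32))

image_out, text_out, w_img, w_txt = twin_fusion(image_feats, text_prior)
print(image_out.shape, text_out.shape)
```

Each stream keeps its own length and only its content is updated, which is why the two attention maps have transposed shapes. A trainable version would add learned projections and most likely multiple heads.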

Key words: super-resolution reconstruction, text images, feature fusion, self-attention mechanism, cross-attention mechanism
