Computer and Modernization ›› 2024, Vol. 0 ›› Issue (01): 41-46.doi: 10.3969/j.issn.1006-2475.2024.01.007


Scenes Text Modification Network for Uyghur Based on Generative Adversarial Network

  

  (1. School of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China;
    2. Xinjiang Agricultural Informatization Engineering Technology Research Center, Urumqi 830052, China;
    3. Multilingual Information Technology Laboratory, Xinjiang Technology Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China)
  • Online: 2024-01-23  Published: 2024-02-23

Abstract: Research on scene text detection and recognition for Uyghur shows that manually collecting labeled natural scene text images is time-consuming and labor-intensive, so synthetic data is used as the main source of training data. To obtain more realistic data, a scene text modification network for Uyghur based on a generative adversarial network is proposed. The network is built from efficient Transformer modules, which fully extract the global and local features of the image to carry out the Uyghur text modification, and a fine-tuning module is added to refine the final result. The model is trained with the WGAN strategy, which effectively mitigates mode collapse and gradient explosion. The generalization ability and robustness of the model are verified by English-English and English-Uyghur text modification experiments. Good results are achieved in both objective metrics (SSIM, PSNR) and visual quality, and the model is validated on the real scene datasets SVT and ICDAR 2013.
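
For readers unfamiliar with the objective metrics named in the abstract, the following is a minimal sketch of how PSNR is commonly computed between a reference image and a generated image. It is not the authors' evaluation code: the function name `psnr`, the nested-list grayscale image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two grayscale images.

    Both images are nested lists of pixel intensities with identical shape.
    max_value is the assumed peak intensity (255 for 8-bit images).
    """
    # Flatten both images into one-dimensional pixel sequences.
    ref_pixels = [p for row in reference for p in row]
    gen_pixels = [p for row in generated for p in row]
    if not ref_pixels or len(ref_pixels) != len(gen_pixels):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(ref_pixels, gen_pixels)) / len(ref_pixels)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the generated image is closer, pixel by pixel, to the reference. SSIM complements it by comparing local structure rather than raw pixel error, which is why the two metrics are usually reported together.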

Key words: generative adversarial networks, scene text editing, Uyghur scene text image, efficient Transformer, WGAN

CLC Number: