Computer and Modernization ›› 2022, Vol. 0 ›› Issue (08): 7-12.

Chinese Word Segmentation Model Based on Attention-BIGRU-CRF

  

  1. (College of Communication and Information Engineering, Nanjing University
    of Posts and Telecommunications, Nanjing 210003, China)
  • Online: 2022-08-22 Published: 2022-08-22

Abstract: Natural language processing is an important branch of artificial intelligence, and Chinese word segmentation is its first step. Improving Chinese word segmentation can improve the accuracy of downstream natural language processing results. This paper therefore proposes an Attention-BIGRU-CRF model. First, the Chinese text is converted into vector form through word-vector conversion, and a BIGRU (bidirectional gated recurrent unit) network performs sequence learning on these vectors. Next, an attention mechanism computes the correlation between the input and output of the BIGRU to obtain more accurate vector values. Finally, these attention vectors are concatenated with the BIGRU output vectors and fed to a CRF (conditional random field) layer, which produces the label predictions. Simulation results show that the Attention-BIGRU-CRF model reaches F1 values of 97.34% and 98.25% on the People's Daily 2014 and MSRA corpora, respectively, and segments text at a rate of 248.1 KB/s. Integrating the attention mechanism with the BIGRU-CRF network thus improves both the accuracy and the time efficiency of word segmentation.

Key words: BIGRU, CRF, attention mechanism, Chinese word segmentation
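
The attention and concatenation steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the dot-product scoring function, the assumption that BIGRU inputs and outputs share one dimension, the B/M/E/S tag scheme, and the helper names `attend_and_splice` and `tags_to_words` are all illustrative assumptions. The BIGRU and CRF layers themselves are omitted; the sketch only shows how an attention context vector might be spliced onto each BIGRU output, and how predicted labels could be turned back into words.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_and_splice(inputs, outputs):
    """Score each input x_t against every BIGRU output h_s, build a
    context vector c_t, and return the concatenation [h_t ; c_t].

    Assumption: inputs and outputs have the same dimension, so a plain
    dot product can serve as the correlation score.
    """
    spliced = []
    for x_t, h_t in zip(inputs, outputs):
        weights = softmax([dot(x_t, h_s) for h_s in outputs])
        context = [
            sum(w * h_s[i] for w, h_s in zip(weights, outputs))
            for i in range(len(h_t))
        ]
        spliced.append(h_t + context)  # list concatenation
    return spliced


def tags_to_words(chars, tags):
    """Convert per-character B/M/E/S tags (assumed scheme) into words.

    A word ends at every E (end) or S (single-character word) tag.
    """
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep any trailing, unterminated fragment
        words.append(current)
    return words
```

In a full model, the spliced vectors would be projected to per-tag scores and decoded by the CRF layer. Under the assumed tag scheme, `tags_to_words("我爱自然语言", ["S", "S", "B", "E", "B", "E"])` returns `["我", "爱", "自然", "语言"]`.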