Computer and Modernization ›› 2024, Vol. 0 ›› Issue (03): 24-28. doi: 10.3969/j.issn.1006-2475.2024.03.004

Chinese Named Entity Recognition with Fusion of Lexicon Information and Sentence Semantics

  

  1. (School of Computer Science, Xi’an Polytechnic University, Xi’an 710048, China)
  • Online: 2024-03-28 Published: 2024-04-28

Abstract: Rapid advances in deep learning have significantly improved the performance of named entity recognition. However, the strong results of deep networks usually depend on large amounts of labeled data, which makes it difficult to extract deep information from small datasets. This paper proposes LS-NER, a Chinese named entity recognition model that combines lexicon information with sentence semantics. First, the candidate words that each character matches in the dictionary are supplied to the model as prior lexical information, which addresses the Chinese word segmentation problem. Second, sentence embeddings that carry semantic information, normally used to compute text similarity, are applied to named entity recognition so that the model can identify similar entities in similar sentences. Finally, a feature fusion strategy is designed so that the model can effectively learn the semantic information provided by the sentence embeddings. Experiments show that the approach performs well on two small datasets, Resume and Weibo. Adding sentence semantics helps the model learn deeper features without additional external information: compared with the same model without sentence information, F1 is 0.15 percentage points higher on Resume and 2.26 percentage points higher on Weibo.
Key words: named entity recognition; BERT; SoftLexicon; Sentence-BERT; CRF
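
The first step of the abstract, matching each character against dictionary words, can be illustrated with a minimal Python sketch. It follows the general SoftLexicon idea named in the key words: every character collects the lexicon words in which it appears at the beginning (B), middle (M), or end (E), or as a single-character word (S). The function name `match_lexicon`, the `max_len` parameter, and the toy lexicon are assumptions made for illustration; the abstract does not give the exact procedure LS-NER uses, nor how the matched words are embedded and fused.

```python
def match_lexicon(sentence, lexicon, max_len=4):
    """Collect, for every character, the lexicon words that contain it.

    Hypothetical helper: returns one dict per character with the keys
    "B", "M", "E", "S" (begin, middle, end, single), each holding a set
    of matched words. Words longer than max_len are not considered.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for i in range(n):
        # Try every substring that starts at position i.
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = sentence[i:j]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[i]["S"].add(word)
            else:
                sets[i]["B"].add(word)          # first character of the word
                sets[j - 1]["E"].add(word)      # last character of the word
                for k in range(i + 1, j - 1):   # characters strictly inside
                    sets[k]["M"].add(word)
    return sets


# Toy lexicon, assumed for this example only.
toy_lexicon = {"中山", "中山西", "山西", "山西路", "西路", "路"}
word_sets = match_lexicon("中山西路", toy_lexicon)

for char, groups in zip("中山西路", word_sets):
    print(char, groups)
```

In this toy case the character 山 begins 山西 and 山西路, sits inside 中山西, and ends 中山, so conflicting segmentations are all kept as candidate features rather than resolved in advance.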
