Computer and Modernization

Research on Named Entity Recognition Method in Specific Fields

  

  1. (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
  • Received: 2017-07-12 Online: 2018-04-03 Published: 2018-04-03

Abstract: Named entity recognition in specific domains relies on identification methods tailored to each field. Texts from different fields have their own distinctive textual features, which makes it difficult for existing identification methods to adapt to a new domain. To address this problem, this paper proposes a method that combines conditional random fields (CRF), semi-supervised learning, and active learning into a unified technical framework that can be adapted to named entity recognition in each specific domain. The method constructs a feature set based on the characteristics of rail transit text and trains a CRF to recognize named entities in that text. It then selects the samples whose confidence falls below a chosen threshold and has them manually labeled to extend the training set, with the aim of improving recognition performance. To validate the method, experiments are carried out in the rail transit domain. The experimental results show that the method is effective and achieves good recognition performance in this domain.

Key words: active learning, semi-supervised, conditional random field (CRF), named entity recognition (NER), specific domain
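
The sample-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how confidence is computed or what threshold is used, so everything here is an assumption: the function name `split_by_confidence`, the tuple layout, the tag names, and the example scores are all hypothetical. Reusing the confident predictions as automatically labeled data is one plausible reading of the semi-supervised part, not something the abstract confirms.

```python
from typing import List, Tuple

Sentence = List[str]
Labels = List[str]
# One model prediction: tokens, predicted tags, and a sequence-level
# confidence score in [0, 1] (assumed to come from a trained CRF).
Prediction = Tuple[Sentence, Labels, float]


def split_by_confidence(
    predictions: List[Prediction], threshold: float
) -> Tuple[List[Tuple[Sentence, Labels]], List[Sentence]]:
    """Split predictions into confident ones and ones needing manual labels.

    Sentences whose confidence is below `threshold` are returned for
    manual annotation (the active-learning step). The rest keep their
    predicted tags (assumed here to serve as automatically labeled data).
    """
    auto_labeled: List[Tuple[Sentence, Labels]] = []
    to_annotate: List[Sentence] = []
    for sentence, labels, confidence in predictions:
        if confidence < threshold:
            to_annotate.append(sentence)
        else:
            auto_labeled.append((sentence, labels))
    return auto_labeled, to_annotate


# Illustrative, made-up predictions; not data from the paper.
example = [
    (["Line", "2", "departs"], ["B-LINE", "I-LINE", "O"], 0.93),
    (["Signal", "fault", "at", "Xizhimen"], ["O", "O", "O", "B-STATION"], 0.41),
]
auto_labeled, to_annotate = split_by_confidence(example, threshold=0.6)
```

In a full loop, the manually annotated sentences would be added to the training set and the CRF retrained before the next round of selection.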
