Computer and Modernization ›› 2022, Vol. 0 ›› Issue (09): 40-50.

Entity Recognition Method on EMR Based on Multi-task Learning

  

  1. (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
  • Online: 2022-09-22 Published: 2022-09-22

Abstract: Named entity recognition in Chinese electronic medical records (EMR) is a difficult problem in medical information extraction. This paper proposes a multi-task learning mechanism that jointly trains entity recognition and word segmentation. Private layers based on Bi-LSTM extract the private (task-specific) features, an attention network serves as the shared layer, and a general feature enhancement mechanism is added to filter the global information, which reduces the risk of over-fitting and enhances the model's generalization ability. Moreover, a balanced oversampling method is proposed to augment the EMR dataset, which effectively solves the problem caused by the large imbalance among EMR entity types. The CCKS2017 and CCKS2020 EMR entity recognition datasets and a medical word segmentation dataset are used for joint learning. The experimental results show that the overall performance in EMR entity recognition is significantly improved, and that the word segmentation benchmark on the medical dataset is also raised by 3 percentage points in F1 value. The detailed analysis shows that the proposed model can effectively correct the entity chunking errors caused by irregular writing style, unstructured text, or specialized terminology in the EMR dataset.

Key words: deep learning, named entity recognition, multi-task learning, neural network, attention mechanism
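
The abstract names a balanced oversampling method but does not describe its procedure. As a rough illustration only, the sketch below shows one common way to oversample for class balance: sentences containing under-represented entity types are duplicated until every type has at least as many mentions as the most frequent type. The function name `balanced_oversample`, the `(text, entity_types)` sample layout, and the labels `symptom` and `drug` are hypothetical and are not taken from the paper, so this should not be read as the authors' algorithm.

```python
import random
from collections import Counter


def balanced_oversample(samples, seed=0):
    """Duplicate sentences that contain rare entity types.

    samples: list of (text, entity_types) pairs, where entity_types is a
    list with one label per entity mention in the sentence.
    Assumption: this is a generic duplication scheme, not the paper's method.
    """
    rng = random.Random(seed)
    # Count entity mentions per type over the original data.
    counts = Counter(t for _, types in samples for t in types)
    if not counts:
        return list(samples)
    target = max(counts.values())  # mention count of the most frequent type

    augmented = list(samples)
    # Handle the rarest types first.
    for etype in sorted(counts, key=counts.get):
        pool = [s for s in samples if etype in s[1]]
        while counts[etype] < target:
            pick = rng.choice(pool)
            augmented.append(pick)
            # A duplicated sentence adds mentions for every type it contains.
            counts.update(pick[1])
    return augmented


if __name__ == "__main__":
    data = [
        ("sentence A", ["symptom"]),
        ("sentence B", ["symptom"]),
        ("sentence C", ["symptom"]),
        ("sentence D", ["drug"]),
    ]
    balanced = balanced_oversample(data)
    print(Counter(t for _, types in balanced for t in types))
```

Because duplicating a sentence also raises the counts of any other types it contains, the result is balanced only approximately on real data; frequent types can end up above the target.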