Computer and Modernization ›› 2024, Vol. 0 ›› Issue (01): 87-91.doi: 10.3969/j.issn.1006-2475.2024.01.014


Named Entity Recognition in Electronic Medical Record Based on BERT

  

  1. School of Information Science and Engineering, Hunan University of Chinese Medicine, Changsha 410208, China;
  2. School of Computer Science and Engineering, Central South University, Changsha 410083, China
  • Online: 2024-01-23 Published: 2024-02-26

Abstract: Electronic medical records are an important resource for preserving, managing, and transmitting patients' medical information, and an important textual record of doctors' diagnosis and treatment of diseases. Named entity recognition (NER) for electronic medical records can efficiently and automatically extract diagnosis and treatment information such as symptoms, diseases, and drug names. Structuring electronic medical records in this way makes it easier to apply machine learning and related techniques to mine regularities in diagnosis and treatment. To identify named entities in electronic medical records efficiently, a named entity recognition method is proposed that combines BERT, a bidirectional long short-term memory network (BILSTM), and a conditional random field (CRF) with adversarial training by the fast gradient method (FGM), referred to as BERT-BILSTM-CRF-FGM (BBCF). After the Chinese electronic medical record corpus provided by the 2017 China Conference on Knowledge Graph and Semantic Computing (CCKS2017) is corrected in preprocessing, the BERT-BILSTM-CRF-FGM model is used to recognize five types of entities in the corpus, with an average F1 score of 92.84%. Compared with a BERT model based on the iterated dilated convolutional neural network (BERT-IDCNN-CRF) and a conditional random field model based on BILSTM (BILSTM-CRF), the proposed method achieves a higher F1 score and faster convergence, and can therefore structure electronic medical record text more efficiently.
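
The adversarial training step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the perturbation formula or its hyperparameters, so this assumes the commonly used fast gradient method formulation, in which the embedding gradient g is scaled to r = epsilon * g / ||g||_2 and added to the embeddings before a second forward pass. The function names `fgm_perturbation` and `apply_fgm`, the default `epsilon=1.0`, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * g / ||g||_2 for a gradient given as a list of rows.

    Assumption: the L2 norm is taken over the whole gradient matrix.
    A zero gradient yields a zero perturbation to avoid division by zero.
    """
    norm = math.sqrt(sum(g * g for row in grad for g in row))
    if norm == 0.0:
        return [[0.0 for _ in row] for row in grad]
    scale = epsilon / norm
    return [[scale * g for g in row] for row in grad]


def apply_fgm(embeddings, grad, epsilon=1.0):
    """Return a perturbed copy of the embeddings; the input is left unchanged."""
    r = fgm_perturbation(grad, epsilon)
    return [
        [e + d for e, d in zip(e_row, r_row)]
        for e_row, r_row in zip(embeddings, r)
    ]


# Toy example: two embedding vectors of dimension 2 (illustrative values only).
embeddings = [[0.5, -0.2], [0.1, 0.3]]
grad = [[3.0, 0.0], [0.0, 4.0]]  # L2 norm of this gradient is 5
perturbed = apply_fgm(embeddings, grad, epsilon=1.0)
print(perturbed)
```

In a typical training loop built on this idea, the loss is computed once on the clean embeddings, the perturbed embeddings are passed through the model a second time, the gradients of both passes are accumulated, and the original embeddings are restored before the optimizer step. Whether the paper follows exactly this schedule is not stated in the abstract.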

Key words: electronic medical record, named entity recognition, BERT, FGM, BILSTM (bidirectional long short-term memory network), conditional random field

CLC Number: