Entity Recognition Method on EMR Based on Multi-task Learning

Abstract

Abstract: Named entity recognition (NER) on Chinese electronic medical records (EMR) is a difficult problem in medical information extraction. This paper proposes a multi-task learning method for entity recognition that jointly trains entity recognition and word segmentation. Private layers based on Bi-LSTM extract task-specific features, while an attention network serves as the shared layer, with a general feature enhancement mechanism added to filter global information; this reduces the risk of over-fitting and improves the model's generalization ability. In addition, a balanced oversampling method is proposed to augment the dataset, which effectively addresses the problems caused by imbalance among entity categories. The CCKS2017/CCKS2020 EMR entity recognition corpora and the Medicine word segmentation corpus are used for joint training. Experimental results show that the overall performance of the proposed model on EMR entity recognition improves markedly, and that word segmentation on the Medicine corpus also improves significantly, with the F1 value rising by 3 percentage points over the baseline. The analysis shows that the proposed model effectively reduces entity segmentation errors caused by irregular writing, unstructured text, and specialized terminology in EMR.

Key words: deep learning, named entity recognition, multi-task learning, neural network, attention mechanism
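
The abstract does not describe how the balanced oversampling is carried out. As a rough illustration of the general idea only, the sketch below duplicates sentences that contain rare entity types until every type has at least as many entity spans as the most frequent type had originally. The BIO tagging scheme, the sentence-level duplication strategy, and the names `entity_types` and `balanced_oversample` are assumptions made for this example, not the authors' procedure.

```python
import random
from collections import Counter


def entity_types(tags):
    """Return one entity type per span; each 'B-' tag marks a span start."""
    return [t[2:] for t in tags if t.startswith("B-")]


def balanced_oversample(dataset, seed=0):
    """Duplicate sentences holding rare entity types (hypothetical scheme).

    dataset: list of (tokens, tags) pairs with BIO tags.
    Returns the original sentences followed by the sampled duplicates.
    """
    rng = random.Random(seed)
    counts = Counter(t for _, tags in dataset for t in entity_types(tags))
    if not counts:
        return list(dataset)
    target = max(counts.values())  # span count of the most frequent type
    augmented = list(dataset)
    for etype in sorted(counts, key=counts.get):  # rarest type first
        # Only original sentences containing this type are candidates.
        pool = [s for s in dataset if etype in entity_types(s[1])]
        while counts[etype] < target:
            sample = rng.choice(pool)
            augmented.append(sample)
            # A duplicate also adds spans of any other types it contains.
            counts.update(entity_types(sample[1]))
    return augmented
```

Because a duplicated sentence may also contain frequent entity types, the resulting distribution is only approximately balanced; a method that resamples at the entity level rather than the sentence level would behave differently.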

YU Peng, CHEN Yu-feng, XU Jin-an, ZHANG Yu-jie. Entity Recognition Method on EMR Based on Multi-task Learning[J]. Computer and Modernization (计算机与现代化), 2022, 0(09): 40-50.
