Computer and Modernization ›› 2021, Vol. 0 ›› Issue (07): 71-76.

• Artificial Intelligence •

Intention Recognition and Classification Based on BERT-FNN

  1. (College of Sciences, Northeastern University, Shenyang 110004, China)
  • Online: 2021-08-02  Published: 2021-08-02
  • About the authors: ZHENG Xinyue (born 1996), female, from Suihua, Heilongjiang, is an M.S. candidate; her research interests include artificial intelligence, data mining, and natural language processing, E-mail: 15244672936@163.com. Corresponding author: REN Junchao (born 1977), male, from Shouxian, Anhui, is an associate professor and master's supervisor with a Ph.D.; his research interests include artificial intelligence and data mining, E-mail: renjc@mail.neu.edu.cn.
  • Funding:
    National Natural Science Foundation of China (61673100, 61703083); the Fundamental Research Funds for the Central Universities (N150504011)



Abstract: Intention recognition and classification is an active topic in natural language processing: understanding a user's intention from context is both a key and a difficult problem in intelligent robots and intelligent customer service. Traditional approaches rely on rule-based, template-matching methods or on classical machine learning, but these suffer from high computational cost and poor generalization ability. To address these problems, this paper builds on Google's publicly released BERT pre-trained language model to perform context modeling and sentence-level semantic representation of the input text: the vector corresponding to the [CLS] token is taken as the representation of the whole sentence, and a fully-connected neural network (FNN) then extracts features from it. To make full use of the data, the multi-class problem is converted, following the one-vs-rest decomposition idea, into multiple binary classification problems: each class in turn is treated as the positive example and all remaining classes as negative examples, producing one binary task per class, whose outputs together yield the intent classification. Experimental results show that this method outperforms traditional models, achieving 94% accuracy.

Key words: natural language processing, intention recognition, BERT, FNN, one-vs-rest decomposition
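The one-vs-rest decomposition described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: where the paper feeds BERT [CLS] vectors into a fully-connected network, this sketch substitutes random 2-D feature vectors and simple logistic heads, and all class, function, and variable names are illustrative assumptions. What it does show faithfully is the decomposition itself: each of the K classes in turn serves as the positive example and the rest as negatives, yielding K binary tasks whose scores are combined by argmax.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BinaryHead:
    """One binary task: a single logistic head scoring 'class k vs. the rest'."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def fit(self, X, y, epochs=200):
        for _ in range(epochs):
            p = sigmoid(X @ self.w + self.b)       # predicted P(positive)
            grad = p - y                            # gradient of the log-loss
            self.w -= self.lr * (X.T @ grad) / len(y)
            self.b -= self.lr * grad.mean()

    def score(self, X):
        return sigmoid(X @ self.w + self.b)

class OneVsRestIntentClassifier:
    """Trains one binary head per intent; predicts the highest-scoring intent."""
    def __init__(self, n_classes, dim):
        self.heads = [BinaryHead(dim) for _ in range(n_classes)]

    def fit(self, X, labels):
        for k, head in enumerate(self.heads):
            # Class k is the positive example; every other class is negative.
            head.fit(X, (labels == k).astype(float))
        return self

    def predict(self, X):
        scores = np.stack([h.score(X) for h in self.heads], axis=1)
        return scores.argmax(axis=1)

# Toy demo: three well-separated "intents" in a 2-D feature space standing in
# for the 768-dimensional [CLS] embeddings used in the paper.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.5]])
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in centers])
y = np.repeat(np.arange(3), 30)

clf = OneVsRestIntentClassifier(n_classes=3, dim=2).fit(X, y)
accuracy = (clf.predict(X) == y).mean()
```

In the paper's setting, `X` would instead hold BERT [CLS] vectors and the heads would be fully-connected layers fine-tuned together with the encoder; the demo above only isolates the decomposition step.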