Stance Detection with LoRA-based Fine-tuning General Language Model

doi:10.3969/j.issn.1006-2475.2025.01.001

Abstract

Abstract: Stance detection is a key task in natural language processing that determines an author's stance from an analysis of the text. Text stance detection methods have moved from early machine learning approaches to BERT models and, most recently, to large language models such as ChatGPT. Because ChatGPT is closed-source, this paper builds on the domestically developed open-source ChatGLM3 model and proposes a text stance detection model, ChatGLM3-LoRA-Stance. To apply a large model effectively in a specialized vertical domain, the paper adopts LoRA, a parameter-efficient fine-tuning method. Compared with P-Tuning V2, LoRA is better suited to the zero-shot and few-shot text stance detection tasks studied here. The ChatGLM3 model is fine-tuned on the publicly available VAST dataset, and existing models are evaluated in zero-shot and few-shot scenarios. Experimental results show that ChatGLM3-LoRA-Stance achieves significantly higher F1 scores than the other models on both the zero-shot and few-shot detection tasks. These results highlight the potential of large language models for text stance detection and indicate that LoRA efficient fine-tuning can significantly improve the performance of the ChatGLM3 large language model on this task.

Key words: LoRA fine-tuning, General Language Model (GLM), stance detection, zero-shot and few-shot detection
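The LoRA idea referred to in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not load ChatGLM3: it only shows the general LoRA formulation, in which a frozen weight matrix W is augmented by a trainable low-rank product B·A scaled by alpha / r. The matrix sizes, the rank `r`, the scaling value `alpha`, and all function names below are illustrative assumptions; the abstract does not state the hyperparameters used in the paper.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def init_lora(d_out, d_in, r, seed=0):
    """Create LoRA factors: A (r x d_in) is small random, B (d_out x r) is zero.

    Starting B at zero means the adapted layer initially behaves exactly
    like the frozen base layer.
    """
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
    B = [[0.0] * r for _ in range(d_out)]
    return A, B


def lora_forward(x, W, A, B, alpha):
    """Compute h = W x + (alpha / r) * B (A x), with W kept frozen."""
    r = len(A)                          # rank of the low-rank update
    base = matvec(W, x)                 # frozen pretrained path
    update = matvec(B, matvec(A, x))    # trainable low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in A and B (W itself is not trained)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    d_out, d_in, r, alpha = 8, 6, 2, 4.0   # hypothetical sizes for illustration
    rng = random.Random(1)
    W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
    x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]
    A, B = init_lora(d_out, d_in, r)
    print(lora_forward(x, W, A, B, alpha) == matvec(W, x))
    print(trainable_params(d_out, d_in, r), "trainable vs", d_out * d_in, "full")
```

Only A and B would be updated during fine-tuning, so the number of trained values grows with r·(d_in + d_out) rather than d_in·d_out, which is what makes the approach practical for a model the size of ChatGLM3.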

CLC number: TP391

HAN Xiaolong, ZENG Xi, LIU Kun, SHANG Yu. Stance Detection with LoRA-based Fine-tuning General Language Model[J]. Computer and Modernization, 2025(01): 1-6.

