Cross-language Multi-label Sentiment Classification Based on Stacked Denoising AutoEncoder

doi:10.3969/j.issn.1006-2475.2023.11.002

Abstract

The multi-label sentiment classification task addresses the problem that a single instance may be associated with multiple sentiment labels. Most existing multi-label sentiment classification models are designed for complete data, so their performance is easily degraded when the data themselves are incomplete. To address this problem, a cross-language multi-label sentiment classification model based on a stacked denoising autoencoder (SDAE) is proposed, and a loss function is introduced to compensate for the loss caused by training. In this model, the word vectors are denoised by the stacked denoising autoencoder to construct low-dimensional features of the original data. This reduces noise interference in the feature space and provides an effective feature representation for downstream tasks. In multi-label sentiment classification experiments on the SemEval-2018 datasets in three languages (English, Arabic and Spanish), the micro-F1 score, macro-F1 score and Jaccard index of the model on the test set all improve. Macro-F1 improves by about 0.82, 1.45 and 1.83 percentage points on the three languages, respectively.
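The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary label matrices (one row per instance, one column per sentiment label) and the common sample-averaged definition of the Jaccard index; the function name `multilabel_scores` and the convention of scoring an instance with no true and no predicted labels as 1.0 are assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, Sequence


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Compute micro-F1, macro-F1 and sample-averaged Jaccard index."""
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    jaccard_sum = 0.0

    for t_row, p_row in zip(y_true, y_pred):
        inter = 0  # labels both true and predicted for this instance
        union = 0  # labels true or predicted for this instance
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tp[j] += 1
                inter += 1
            elif p:
                fp[j] += 1
            elif t:
                fn[j] += 1
            if t or p:
                union += 1
        # Assumption: an instance with no true and no predicted labels scores 1.0.
        jaccard_sum += inter / union if union else 1.0

    # Micro-F1 pools the counts over all labels before computing F1.
    micro = _f1(sum(tp), sum(fp), sum(fn))
    # Macro-F1 averages the per-label F1 scores with equal weight.
    macro = sum(_f1(a, b, c) for a, b, c in zip(tp, fp, fn)) / n_labels
    return {
        "micro_f1": micro,
        "macro_f1": macro,
        "jaccard": jaccard_sum / len(y_true),
    }


if __name__ == "__main__":
    truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    preds = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
    print(multilabel_scores(truth, preds))
```

Because macro-F1 weights every label equally, it is sensitive to rare sentiment labels, which may be why the abstract singles it out when reporting the per-language gains.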

Key words: multi-label classification, sentiment classification, incomplete data, BERT, stacked denoising autoencoder (SDAE)

CLC Number: TP391

TANG Shi-qi, ZHOU Rui-ping, XIE Shi-bin, LIU Meng-chi, XIAO Wen. Cross-language Multi-label Sentiment Classification Based on Stacked Denoising AutoEncoder[J]. Computer and Modernization, 2023, 0(11): 6-12.

