Computer and Modernization ›› 2023, Vol. 0 ›› Issue (11): 6-12. doi: 10.3969/j.issn.1006-2475.2023.11.002


Cross-language Multi-label Sentiment Classification Based on Stacked Denoising AutoEncoder

  

  (1. Guangzhou Key Laboratory of Big Data and Intelligent Education, Guangzhou 510631, China;
   2. School of Computer Science, South China Normal University, Guangzhou 510631, China)
  • Online: 2023-11-29 Published: 2023-11-29

Abstract: Multi-label sentiment classification addresses the case in which a single instance may be associated with several sentiment labels. Most existing multi-label sentiment classification models are designed for complete data, so their performance and sentiment predictions are easily affected by incompleteness in the data. To address this problem, a cross-language multi-label sentiment classification model based on a stacked denoising autoencoder (SDAE) is proposed, and a loss function is introduced to compensate for the loss caused by training. In this model, the stacked denoising autoencoder denoises the word vectors to construct low-dimensional features of the original data, which reduces noise interference in the feature space and provides an effective feature representation for downstream tasks. In multi-label sentiment classification experiments on the SemEval-2018 datasets in three languages (English, Arabic and Spanish), the model improves the micro_F1 score, macro_F1 score and Jaccard index on the test set. Macro_F1 improves by about 0.82, 1.45 and 1.83 percentage points, respectively.
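
For readers unfamiliar with the three evaluation measures named in the abstract, the following is a minimal Python sketch of how micro-averaged F1, macro-averaged F1 and a sample-averaged Jaccard index are commonly computed for multi-label predictions. It is an illustration only, not the authors' evaluation code: the function name multilabel_scores, the toy label matrices, and the convention of scoring an empty gold set matched by an empty prediction as 1.0 are all assumptions made here.

```python
from typing import Dict, Sequence


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Compute micro-F1, macro-F1 and sample-averaged Jaccard.

    Both inputs are binary indicator matrices of shape
    (n_samples, n_labels), given as nested sequences of 0/1.
    """
    n_labels = len(y_true[0])
    tp = [0] * n_labels  # per-label true positives
    fp = [0] * n_labels  # per-label false positives
    fn = [0] * n_labels  # per-label false negatives
    jaccard_sum = 0.0

    for gold, pred in zip(y_true, y_pred):
        inter = 0  # labels present in both gold and prediction
        union = 0  # labels present in either
        for k, (g, p) in enumerate(zip(gold, pred)):
            if g and p:
                tp[k] += 1
                inter += 1
            elif p:
                fp[k] += 1
            elif g:
                fn[k] += 1
            if g or p:
                union += 1
        # Assumption: an empty gold set with an empty prediction counts as 1.0.
        jaccard_sum += inter / union if union else 1.0

    # Micro-F1 pools counts over all labels before computing F1.
    micro = _f1(sum(tp), sum(fp), sum(fn))
    # Macro-F1 computes F1 per label, then takes the unweighted mean.
    macro = sum(_f1(tp[k], fp[k], fn[k]) for k in range(n_labels)) / n_labels
    return {
        "micro_f1": micro,
        "macro_f1": macro,
        "jaccard": jaccard_sum / len(y_true),
    }


if __name__ == "__main__":
    # Hypothetical toy data: 3 samples, 3 sentiment labels.
    gold = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
    print(multilabel_scores(gold, pred))
```

Because macro-F1 weights every label equally, rare sentiment labels influence it as much as frequent ones, which is one reason it is often reported alongside micro-F1 in multi-label settings.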

Key words: multi-label classification, sentiment classification, incomplete data, BERT, stacked denoising autoencoder (SDAE)
