Computer and Modernization ›› 2025, Vol. 0 ›› Issue (07): 83-89. doi: 10.3969/j.issn.1006-2475.2025.07.012

Multi-modal Emotion Recognition Based on Text Guidance

  

  1. (School of Computer Science, Xi’an Polytechnic University, Xi’an 710600, China)
  • Online: 2025-07-22 Published: 2025-07-22

Abstract: Multi-modal emotion recognition is widely used in artificial intelligence, safe driving, and other fields. Multi-modal information offers richer representations than any single modality, which makes emotion recognition more accurate. Among the modalities, text expresses rich and accurate information. This paper proposes a multi-modal sentiment analysis model guided by multi-scale text features: text features of different scales are aggregated and used to optimize the features of the other modalities. A text-guided multi-modal aggregation module, AGG, is designed, and the idea of contrastive learning is introduced into the design of the loss function to optimize the whole network. Every experimental metric shows that the model performs excellently in multi-modal emotion recognition, and comparison and ablation experiments further demonstrate the rationality and validity of the design.

Key words: emotion recognition, multi-modal, multi-scale text features, multi-modal fusion
