Computer and Modernization ›› 2021, Vol. 0 ›› Issue (12): 37-42.

Weibo Tag Generation Algorithm Based on LDA and Word2vec

  

  1. (School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212100, China)
  • Online: 2021-12-24 Published: 2021-12-24

Abstract: Tag generation algorithms based on the traditional LDA topic model describe a user’s topics of interest incompletely. To address this problem, TopicERP, a tag generation algorithm for Weibo users based on topic embedding representation, is proposed. Building on the LDA model, the algorithm introduces the Word2vec word embedding model to describe users’ topics of interest more comprehensively and to improve the calculation of the matching degree. First, the LDA topic model is used to analyze the topics of a user’s Weibo posts and generate the user’s topics of interest. Then, the Word2vec word embedding model is used to transform each topic text into a topic vector, which is used to calculate the matching degree. Finally, the matching degree between the topic vectors and each candidate tag is calculated from the cosine similarity and the conditional probability of the topic in the document, and the Top-Q candidate tags are selected as the target user’s tags. Experimental results on MicroPCU, a public Weibo data set, show that the algorithm achieves better overall performance than the algorithm based on the traditional LDA topic model, and that the generated tags describe users’ interests and preferences more accurately.
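
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the scoring rule here is an assumption: each candidate tag is scored by summing, over topics, the conditional probability of the topic in the document multiplied by the cosine similarity between the topic vector and the tag vector. The topic vector is likewise assumed to be a weighted average of the word vectors of the topic's top words. All function and parameter names (`cosine`, `topic_vector`, `top_q_tags`, `doc_topic_probs`) are hypothetical, and the inputs stand in for what an LDA model and a Word2vec model would supply.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def topic_vector(topic_words, embeddings):
    """Weighted average of the word vectors of a topic's top words.

    topic_words: list of (word, weight) pairs, as an LDA model might report.
    embeddings:  dict mapping word -> vector, as a Word2vec model might supply.
    Words without an embedding are skipped. (Assumed construction.)
    """
    dim = len(next(iter(embeddings.values())))
    acc = [0.0] * dim
    total = 0.0
    for word, weight in topic_words:
        vec = embeddings.get(word)
        if vec is None:
            continue
        total += weight
        for i, x in enumerate(vec):
            acc[i] += weight * x
    return [x / total for x in acc] if total else acc


def top_q_tags(doc_topic_probs, topics, embeddings, candidates, q):
    """Return the q candidate tags with the highest matching degree.

    Assumed score: sum over topics of P(topic | document) * cosine(topic_vec, tag_vec).
    Candidates without an embedding are ignored.
    """
    topic_vecs = [topic_vector(t, embeddings) for t in topics]
    scores = {}
    for tag in candidates:
        tag_vec = embeddings.get(tag)
        if tag_vec is None:
            continue
        scores[tag] = sum(
            p * cosine(tv, tag_vec) for p, tv in zip(doc_topic_probs, topic_vecs)
        )
    return sorted(scores, key=scores.get, reverse=True)[:q]
```

In practice the topic-word weights and document-topic probabilities would come from a trained LDA model and the vectors from a trained Word2vec model; the sketch only shows how the two could be combined into a ranking.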

Key words: tag generation, LDA, Word2vec, Weibo