Computer and Modernization

A News Hot Spot Detection Method Based on Semantic Analysis

  

  1. (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
  • Received: 2016-11-16 Online: 2017-06-23 Published: 2017-06-23

Abstract: With the development and spread of the Internet, online news has become a primary way for people to obtain information about society, and detecting hot topics in online news quickly and accurately is a pressing problem. This paper uses two topic models, LDA (Latent Dirichlet Allocation) and BTM (Biterm Topic Model), and accounts for the different influence that headlines and body text have on hot spot detection by analyzing the semantics of each separately. The BTM model is applied to news headlines and the LDA model to news content; topic feature vectors are extracted from each, and the two are combined into a semantic feature for the whole text. An improved clustering algorithm then determines the number of documents belonging to each topic. On this basis, news heat is defined and computed with a heat formula, and sorting the heat values yields the most recent hot news. Experiments on crawled news data verify the validity and practicality of the method.
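The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the combination rule, the improved clustering algorithm, or the heat formula, so everything concrete here is an assumption: the topic vectors are taken as already computed (by BTM for titles and LDA for content), `combine` uses a hypothetical weighted concatenation, `single_pass_cluster` is a plain single-pass clustering used as a stand-in, and heat is approximated by the number of documents per cluster.

```python
from math import sqrt


def combine(title_vec, content_vec, title_weight=0.5):
    """Join a title topic vector and a content topic vector into one feature.

    Weighted concatenation is an assumption; the paper's rule is not stated.
    """
    weighted_title = [title_weight * x for x in title_vec]
    weighted_content = [(1 - title_weight) * x for x in content_vec]
    return weighted_title + weighted_content


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(vectors, threshold=0.8):
    """Stand-in clustering: assign each vector to the most similar cluster
    whose centroid similarity reaches the threshold, else open a new cluster."""
    clusters = []  # each cluster: {"centroid": [...], "members": [doc indices]}
    for index, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": list(vec), "members": [index]})
        else:
            n = len(best["members"])
            # Update the centroid as the running mean of its members.
            best["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(best["centroid"], vec)
            ]
            best["members"].append(index)
    return clusters


def rank_by_heat(clusters):
    """Order clusters by document count, used here as a proxy for news heat."""
    return sorted(clusters, key=lambda c: len(c["members"]), reverse=True)


# Toy (title_vector, content_vector) pairs; the values are made up.
docs = [
    ([0.9, 0.1], [0.8, 0.2]),
    ([0.85, 0.15], [0.9, 0.1]),
    ([0.1, 0.9], [0.2, 0.8]),
]
features = [combine(title, content) for title, content in docs]
ranked = rank_by_heat(single_pass_cluster(features))
```

In this toy run the first two documents share a dominant topic and fall into one cluster, which therefore ranks first. A fuller heat measure would presumably also reflect recency, since the abstract speaks of the most recent hot news, but that detail is not recoverable from the abstract.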

Key words: latent semantic analysis, news heat, topic detection, LDA and BTM model
