Computer and Modernization


A Keyword Extraction Algorithm Based on Adaptive Related Entropy

  

  1. State Grid Hunan Power Supply Service Center (Metrology Center), Changsha 410004, China;
  2. Changde Power Supply Branch Company, State Grid Hunan Power Co. Ltd., Changde 415000, China
  • Received: 2019-06-16   Online: 2020-04-22   Published: 2020-04-24

Abstract: Traditional keyword extraction ranks words by how frequently they occur. The TextRank algorithm improves on this by taking the similarity between vocabulary nodes into account, but it ignores the context in which each word appears and the semantic structure of the article. Building on weighted iteration over the node graph, this paper uses association rule information from the textual context of each word, introduces the concept of association entropy, and adaptively adjusts the damping coefficient and the sliding window size. The resulting method is closer to the actual semantics of the text and handles low-frequency words and new vocabulary better. Experimental results show that it extracts keywords more accurately than the TF-IDF and TextRank (TR) algorithms.
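
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of standard TextRank over a word co-occurrence graph, not the authors' method: the function name `textrank_keywords`, its parameters, and the default values are assumptions made for this example, and the association entropy that the paper uses to adapt the damping coefficient and window size is not defined in the abstract, so both are simply held fixed here.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=3, damping=0.85, iterations=50, top_k=3):
    """Rank words with plain TextRank (hypothetical helper, fixed parameters)."""
    # Build an undirected, weighted co-occurrence graph: two different words
    # are linked each time they appear within `window` positions of each other.
    weights = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                weights[word][other] += 1.0
                weights[other][word] += 1.0

    nodes = sorted(weights)
    scores = {w: 1.0 for w in nodes}
    # Total edge weight leaving each node, used to normalise its votes.
    out_sum = {w: sum(weights[w].values()) for w in nodes}

    # Weighted iteration: each node receives a share of its neighbours' scores,
    # mixed with a constant term controlled by the damping coefficient.
    for _ in range(iterations):
        new_scores = {}
        for w in nodes:
            rank = sum(weights[v][w] / out_sum[v] * scores[v] for v in weights[w])
            new_scores[w] = (1.0 - damping) + damping * rank
        scores = new_scores

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(nodes, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    text = "keyword extraction uses graph ranking and graph ranking uses word links"
    print(textrank_keywords(text.split(), window=3, top_k=3))
```

In this sketch `window` and `damping` are the two quantities that, according to the abstract, the proposed method adjusts adaptively instead of fixing in advance.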

Key words: text mining, keyword extraction, association rules, TextRank, node
