Computer and Modernization

Improved K-medoids Algorithm Based on Similarity Calculation Formula

  

  1. (School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China)
  • Received: 2018-11-28 Online: 2019-05-14 Published: 2019-05-14

Abstract: The traditional K-medoids clustering algorithm generally measures similarity by distance alone. Such a metric assumes that the attributes of data objects are independent and identically distributed (IID), yet the attributes of most real data objects are correlated. This article therefore replaces the traditional distance-based similarity measure with a non-IID similarity calculation formula. Because the non-IID formula is computed from the frequencies of attribute values, and numerical data are not sensitive to frequency, numerical data are first clustered and the corresponding attribute columns are replaced by the clustering result before the formula is applied. Experimental results show that this method improves the clustering accuracy of the algorithm.

Key words: clustering, PAM algorithm, similarity

CLC Number:
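
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the non-IID formula, so the sketch substitutes a simple frequency-weighted mismatch measure as a stand-in, and it uses equal-width binning in place of the clustering step applied to numerical columns. All function names (discretize, frequency_dissimilarity_matrix, pam) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, n_bins=3):
    """Replace numeric values by equal-width bin labels.

    Assumption: stands in for the paper's step of clustering numerical
    data so that attribute-value frequencies become meaningful.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def frequency_dissimilarity_matrix(rows):
    """Pairwise dissimilarity computed from attribute-value frequencies.

    Assumption: an illustrative frequency-weighted mismatch, NOT the
    non-IID formula of the paper. Mismatches between rare values cost
    more than mismatches between frequent ones.
    """
    n, n_attrs = len(rows), len(rows[0])
    counts = [Counter(r[j] for r in rows) for j in range(n_attrs)]

    def attr_dissim(j, a, b):
        if a == b:
            return 0.0
        wa = math.log(n / counts[j][a])
        wb = math.log(n / counts[j][b])
        return 1.0 - 1.0 / (1.0 + wa * wb)

    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for t in range(i + 1, n):
            d = sum(attr_dissim(j, rows[i][j], rows[t][j])
                    for j in range(n_attrs)) / n_attrs
            D[i][t] = D[t][i] = d
    return D


def total_cost(D, medoids):
    """Sum of each object's dissimilarity to its nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))


def pam(D, k, seed=0):
    """PAM-style K-medoids on a precomputed dissimilarity matrix.

    Starts from k random medoids and keeps applying medoid/non-medoid
    swaps that lower the total cost until no swap helps.
    """
    n = len(D)
    medoids = random.Random(seed).sample(range(n), k)
    best = total_cost(D, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(D, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: D[i][medoids[c]]) for i in range(n)]
    return medoids, labels, best


# Toy data (hypothetical): one numerical and one categorical attribute.
ages = [21, 23, 25, 60, 62, 65]
colours = ["red", "red", "blue", "green", "green", "blue"]
rows = list(zip(discretize(ages, n_bins=2), colours))
medoids, labels, cost = pam(frequency_dissimilarity_matrix(rows), k=2)
```

Because the clustering step only consumes a precomputed dissimilarity matrix, the stand-in measure can be swapped for any other frequency-based formula without touching the PAM loop.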