Computer and Modernization ›› 2021, Vol. 0 ›› Issue (10): 23-28.

Track Data Hot Spot Mining Algorithm Based on K-means

  

  (1. College of Information Science and Technology, Qingdao University of Science & Technology, Qingdao 266061, China;
    2. College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China;
    3. College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325000, China)
  • Online: 2021-10-14 Published: 2021-10-14

Abstract: Fishing boat trajectory data are time-ordered and very large, and the K-means algorithm alone cannot capture the distribution of hot spots in such data. To overcome this limitation, this paper proposes a trajectory hot spot mining algorithm. The main idea is as follows. First, the data are processed along the time dimension: confidence and KL divergence are used to measure the reliability and correctness of the selected data, so that a subset with high information content is drawn from the large volume of trajectory data. The K-means clustering algorithm is then used to cluster the processed data. Only two parameters need to be set, the significance level a and the time interval T. Through the time-dimension processing, the algorithm itself completes the data selection and the calculation of confidence and KL divergence, and a cluster validity measure is introduced so that the K value of K-means is searched automatically, which completes the whole hot spot mining process. The proposed algorithm is compared with the K-means algorithm on fishing boat trajectory data, with a heat map of the data serving as a reference. The results show that the proposed algorithm outperforms K-means and correctly identifies the hot spots in the trajectory data.
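
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the thinning rule (keep the first point in each window of length T), the grid histogram used to compare the sample with the full data through KL divergence, and the silhouette coefficient standing in for the unspecified cluster validity measure. The confidence test governed by the significance level a is omitted because the abstract does not define it. All function names (`thin_by_time`, `grid_histogram`, `kl_divergence`, `kmeans`, `silhouette`, `choose_k`) are hypothetical.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two histograms; both are smoothed and normalised."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))


def thin_by_time(points, times, T):
    """Keep the first point of each time window of length T (assumed rule)."""
    windows = np.floor((times - times.min()) / T).astype(int)
    _, first_idx = np.unique(windows, return_index=True)
    return points[np.sort(first_idx)]


def grid_histogram(points, bounds, bins=10):
    """Flattened 2-D occupancy histogram over a fixed bounding box."""
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                bins=bins, range=bounds)
    return hist.ravel()


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers


def silhouette(points, labels):
    """Mean silhouette coefficient; O(n^2) memory, so use on a small sample."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    ks = np.unique(labels)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton cluster: silhouette is taken as 0
        a = dist[i, own].sum() / (own.sum() - 1)   # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in ks if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


def choose_k(points, k_range=range(2, 7)):
    """Return the K whose K-means partition has the highest silhouette."""
    scored = {k: silhouette(points, kmeans(points, k)[0]) for k in k_range}
    return max(scored, key=scored.get)
```

In this sketch, a caller would thin the trajectory with `thin_by_time`, check with `kl_divergence` on the two grid histograms that the thinned sample still resembles the full data, and then pass the sample to `choose_k` before running `kmeans` with the selected K. A small KL divergence suggests the sample preserves the spatial distribution; what threshold counts as small is left to the reader.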

Key words: significant level a, KL divergence, time dimension, cluster validity measurement, track hot