Computer and Modernization

An Outlier Detection Algorithm in Big Data Based on Improved KNN

  

  1. Smart Grid Research Institute, Electric Power Research Institute of China Southern Power Grid Co. Ltd, Guangzhou 510080, China
  • Received: 2016-09-22 Online: 2017-05-26 Published: 2017-05-31

Abstract: The KNN algorithm has two shortcomings in outlier detection on big data: high-dimensional data is difficult to handle, and the time complexity is too high. To address both, a classification method based on AOR (Attribute Overlapping Rate) is proposed and the KNN algorithm is improved. First, the dimensionality of the data is reduced based on AOR, which greatly increases the data dimensionality that can be processed. Then the traditional KNN algorithm is improved by pruning, which eliminates a large amount of invalid computation. Experimental results show that the algorithm substantially improves both operational efficiency and accuracy on large, high-dimensional data samples.

Key words: big data; KNN; dimension reduction; attribute overlapping rate; pruning
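
The pruning idea mentioned in the abstract can be illustrated with a generic sketch. The abstract does not specify the paper's pruning rule or how AOR is computed, so the code below is not the authors' method: it assumes the common scheme in which a point's outlier score is the distance to its k-th nearest neighbour, and a point is abandoned as soon as its running score falls below the weakest score among the current top outliers. The function name `knn_outliers` and its parameters are hypothetical, and the AOR-based dimension reduction step is omitted.

```python
import heapq
import math


def knn_outliers(points, k, n_outliers):
    """Return (score, index) pairs for the n_outliers strongest outliers.

    Assumed scoring (not taken from the paper): a point's score is the
    distance to its k-th nearest neighbour; larger means more outlying.
    """
    top = []      # min-heap of (score, index) for the best outliers so far
    cutoff = 0.0  # weakest score in `top` once it holds n_outliers entries

    for i, p in enumerate(points):
        nearest = []  # negated distances: a max-heap of p's k closest so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # -nearest[0] is an upper bound on p's final score. Once it drops
            # below the cutoff, p cannot enter the top list, so stop early.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n_outliers:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n_outliers:
            cutoff = top[0][0]

    return sorted(top, reverse=True)


# Four clustered points and one distant point (illustrative data only).
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
result = knn_outliers(sample, k=2, n_outliers=1)
```

The worst case remains quadratic in the number of points, but inliers in dense regions are usually discarded after only a few distance computations, which is where the savings from pruning come from.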