Computer and Modernization


A Large Margin Nearest Neighbor Algorithm of Large-scale Text Classification

  

  1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China;
  2. Guangxi Communication Planning and Design Co. Ltd., Nanning 530007, China
  • Received: 2015-12-28 Online: 2016-06-16 Published: 2016-06-17

Abstract: The large margin nearest neighbor (LMNN) algorithm has strong learning and generalization ability and is widely used in classification. However, when it is applied to large-scale text classification, the size of its semidefinite programming (SDP) problem grows rapidly with the amount of data, which makes the problem difficult to solve. To address this, we introduce the Huber loss function, which divides the semidefinite optimization model of the LMNN algorithm into two low-level continuous optimization sub-models, thereby reducing the computational complexity of the algorithm and improving its efficiency. Experimental results on a public opinion classification data set show that, compared with the traditional algorithm, the proposed algorithm improves precision by 4.5% and reduces classification time by 47.1%. The results also indicate that using the low-level decomposition reduction method to improve the performance of the LMNN algorithm is feasible and better suited to large-scale text classification.
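
For readers unfamiliar with the Huber loss mentioned in the abstract, the following is a minimal sketch of its standard textbook definition, together with a Huber-style smoothed hinge of the kind that can stand in for the non-smooth hinge term in an LMNN-type objective. The abstract does not give the paper's exact formulation, so the function names, the `delta`, `h`, and `margin` parameters, and the triplet form are assumptions made for illustration only.

```python
def huber_loss(r: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = abs(r)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def smoothed_hinge(z: float, h: float = 1.0) -> float:
    """Huber-style smoothing of max(0, z); differentiable everywhere.

    Assumption: this is one common smoothing, not necessarily the
    paper's own choice.
    """
    if z <= 0.0:
        return 0.0
    if z <= h:
        return z * z / (2.0 * h)  # quadratic zone near the kink
    return z - 0.5 * h            # linear zone, matches value and slope at z = h


def triplet_loss(d_target: float, d_impostor: float,
                 margin: float = 1.0, h: float = 1.0) -> float:
    """Hypothetical LMNN-style penalty for one (sample, target, impostor) triplet.

    d_target:   squared distance to a same-class target neighbor
    d_impostor: squared distance to a differently labeled sample
    The penalty is zero once the impostor is at least `margin` farther away.
    """
    return smoothed_hinge(margin + d_target - d_impostor, h)
```

Because the smoothed penalty is continuously differentiable, it can be minimized with ordinary gradient-based continuous optimization, which is the general motivation for replacing the hinge term; how the paper splits the model into its two sub-models is not specified in the abstract.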

Key words: semidefinite programming, large margin nearest neighbor, Huber loss function, large-scale text classification, generalization
