Computer and Modernization


A Text Classification Method for Weak Labeling

  

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  Received: 2015-09-02   Online: 2016-01-22   Published: 2016-01-26

Abstract: Multi-label learning differs from traditional supervised learning: it is a framework for representing real-world objects that carry several semantic meanings at once. Under this framework, an instance may be associated with a set of labels. Most existing multi-label learning algorithms assume that the label set of every training example is complete. In practice, however, the label sets of some examples may be incomplete. To address this problem, we propose a text classification method for weak labeling. The method replenishes missing labels by solving an optimization problem built on two assumptions: the strength of correlation varies between different pairs of labels, and similar instances tend to have similar labels. Extensive experiments show that the proposed method can effectively improve the generalization performance of the learning system.
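
The two assumptions named in the abstract can be illustrated with a small label-propagation sketch. This is not the optimization problem of the paper, which the abstract does not specify; it is a simple substitute that smooths label scores over similar documents and over correlated labels. All names here (`complete_labels`, `fill_missing`, `alpha`, `beta`, `threshold`) and the use of cosine similarity are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def neighbor_weights(vectors):
    """Pairwise cosine similarities with a zero diagonal, each row scaled to sum to 1.
    Assumes non-negative vectors (e.g. term counts or 0/1 label columns)."""
    n = len(vectors)
    W = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    return [[w / sum(row) if sum(row) else 0.0 for w in row] for row in W]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def complete_labels(X, Y, alpha=0.4, beta=0.3, iters=50):
    """Return a score matrix F (documents x labels) for a weakly labeled Y.

    Hypothetical update rule (an assumption, not the paper's formulation):
        F <- alpha * S F + beta * F C + (1 - alpha - beta) * Y
    S: document similarity weights ("similar instances have similar labels").
    C: label co-occurrence weights ("label correlations differ per pair").
    Requires alpha + beta < 1 so the iteration converges.
    """
    S = neighbor_weights(X)               # n x n, from document features
    C = neighbor_weights(list(zip(*Y)))   # q x q, from observed label columns
    F = [[float(y) for y in row] for row in Y]
    for _ in range(iters):
        SF, FC = matmul(S, F), matmul(F, C)
        F = [[alpha * SF[i][k] + beta * FC[i][k] + (1 - alpha - beta) * Y[i][k]
              for k in range(len(Y[0]))] for i in range(len(Y))]
    return F

def fill_missing(Y, F, threshold=0.5):
    """Keep every observed label and add labels whose score reaches the threshold."""
    return [[1 if y == 1 or f >= threshold else 0 for y, f in zip(y_row, f_row)]
            for y_row, f_row in zip(Y, F)]

# Toy data: documents 0 and 1 have identical term counts, but document 1
# is missing label 1 in its (incomplete) label set.
X = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
Y = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]]
F = complete_labels(X, Y)
completed = fill_missing(Y, F)
```

In this toy case the missing label of document 1 receives a positive score because its twin document carries that label and because labels 0 and 1 co-occur, while the unrelated label 2 receives none. The weights `alpha` and `beta` and the threshold would need tuning on real data.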

Key words: weak labeling, document classification, multi-label learning, machine learning, data mining

CLC Number: