Computer and Modernization

Classifier Ensemble for Imbalanced Data Stream Classification Based on Accumulated Minorities

  

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China;
  2. School of Natural and Applied Sciences, Northwestern Polytechnical University, Xi’an 710129, China
  • Received: 2014-12-08 Online: 2015-03-23 Published: 2015-03-26

Abstract: Existing methods for balancing imbalanced data streams tend to over-fit and do not make full use of the current data. To address these problems, this paper proposes EAMIDS, an ensemble method for imbalanced data streams based on accumulated positive samples. In EAMIDS, positive samples from previous training chunks are accumulated into an AP set, which is used to balance the class distribution by means of K nearest neighbors and over-sampling. When the number of available base classifiers exceeds the fixed size of the ensemble, the ensemble is updated according to F-Measure. Experiments on the SEA and SPH datasets show that EAMIDS has a substantial advantage in prediction accuracy over the IDSL and SMOTE approaches.
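
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the K nearest neighbors are used, so the selection rule below (ranking accumulated positives by mean distance to their k nearest current positives) is an assumption, as are the function names `f_measure`, `balance_chunk` and `update_ensemble` and the choice to model each base classifier as a callable that maps a feature matrix to predicted labels.

```python
import numpy as np


def f_measure(y_true, y_pred, positive=1):
    """F-Measure (F1) of the positive class; 0.0 when it is undefined."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def balance_chunk(X, y, ap_set, k=5, positive=1):
    """Over-sample the positive class of a chunk from the accumulated AP set.

    Assumed rule (not specified in the abstract): rank each accumulated
    positive by its mean distance to its k nearest positives in the current
    chunk, and add the closest ones until the classes are balanced or the
    AP set is exhausted.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    ap_set = np.asarray(ap_set, dtype=float)
    pos = X[y == positive]
    n_missing = int(np.sum(y != positive)) - len(pos)
    if n_missing <= 0 or len(pos) == 0 or len(ap_set) == 0:
        return X, y
    # Pairwise Euclidean distances, shape (n_accumulated, n_current_positives).
    dists = np.linalg.norm(ap_set[:, None, :] - pos[None, :, :], axis=2)
    kk = min(k, len(pos))
    score = np.sort(dists, axis=1)[:, :kk].mean(axis=1)
    chosen = ap_set[np.argsort(score)[:n_missing]]
    X_bal = np.vstack([X, chosen])
    y_bal = np.concatenate([y, np.full(len(chosen), positive, dtype=y.dtype)])
    return X_bal, y_bal


def update_ensemble(ensemble, new_clf, X, y, max_size, positive=1):
    """Add a base classifier; if the fixed size is exceeded, keep the
    max_size members with the highest F-Measure on (X, y)."""
    candidates = list(ensemble) + [new_clf]
    if len(candidates) <= max_size:
        return candidates
    scores = [f_measure(y, clf(X), positive) for clf in candidates]
    order = np.argsort(scores)[::-1][:max_size]
    return [candidates[i] for i in order]
```

In a streaming loop, each new chunk would be balanced with `balance_chunk`, a base classifier trained on the balanced chunk and passed to `update_ensemble`, and the chunk's positive samples then appended to the AP set (for example with `np.vstack`) for use with later chunks.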

Key words: imbalanced data streams, accumulated positive samples, ensemble classifiers, concept drift

CLC Number: