Computer and Modernization ›› 2021, Vol. 0 ›› Issue (08): 70-76.


Software Defect Prediction Based on Hybrid Sampling and Random_Stacking

  

  1. (College of Information Science & Technology, Qingdao University of Science and Technology, Qingdao 266061, China)  
  • Online: 2021-08-19 Published: 2021-08-19

Abstract: Existing software defect prediction methods face problems such as class imbalance and high-dimensional data, and solving these problems effectively has become a research hotspot in the field. To address the class imbalance and low prediction accuracy encountered in software defect prediction, this paper proposes DP_HSRS, a software defect prediction algorithm based on hybrid sampling and Random_Stacking. DP_HSRS first applies a hybrid sampling algorithm to balance the imbalanced data, then applies the Random_Stacking algorithm to the balanced data set to predict software defects. Random_Stacking improves on the traditional Stacking algorithm: it builds multiple Stacking classifiers by combining several classic classification algorithms with the Bagging mechanism, combines these Stacking classifiers by voting to obtain an ensemble classifier, and uses that ensemble classifier to predict software defects. Experiments on the NASA MDP data set show that DP_HSRS achieves better defect prediction performance than existing algorithms.
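The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation. It makes three assumptions: hybrid sampling is taken to mean random oversampling of the minority class plus random undersampling of the majority class, down or up to the mean class size; the ensemble members are trained on bootstrap samples in the style of Bagging; and the member learner is supplied by the caller. The names `hybrid_sample`, `fit_random_ensemble`, `predict_vote`, and `fit_threshold` are hypothetical, and `fit_threshold` is only a placeholder standing in for a real Stacking classifier.

```python
import random
from collections import Counter


def hybrid_sample(X, y, seed=0):
    """Balance classes: undersample large classes and oversample small ones
    to the mean class size (an assumed target, not taken from the paper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = sum(len(rows) for rows in by_class.values()) // len(by_class)
    Xb, yb = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)  # undersample without replacement
        else:
            # oversample by duplicating randomly chosen minority rows
            chosen = rows + rng.choices(rows, k=target - len(rows))
        Xb.extend(chosen)
        yb.extend([label] * target)
    return Xb, yb


def fit_random_ensemble(X, y, fit_member, n_members=5, seed=0):
    """Train each member on a bootstrap sample (Bagging-style).
    fit_member(X, y) must return a callable that maps one row to a label."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        members.append(fit_member([X[i] for i in idx], [y[i] for i in idx]))
    return members


def predict_vote(members, row):
    """Majority vote over the members' predictions for one row."""
    votes = Counter(member(row) for member in members)
    return votes.most_common(1)[0][0]


def fit_threshold(X, y):
    """Placeholder learner (NOT Stacking): cut feature 0 at the midpoint of
    the two class means, assuming defective modules (label 1) score higher."""
    pos = [row[0] for row, label in zip(X, y) if label == 1]
    neg = [row[0] for row, label in zip(X, y) if label == 0]
    if not pos or not neg:  # bootstrap sample contained a single class
        constant = 1 if pos else 0
        return lambda row: constant
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda row: int(row[0] > cut)


# Toy data: 18 clean modules (label 0) and 2 defective modules (label 1).
X = [[float(i % 3)] for i in range(18)] + [[10.0], [11.0]]
y = [0] * 18 + [1] * 2
Xb, yb = hybrid_sample(X, y)
members = fit_random_ensemble(Xb, yb, fit_threshold)
prediction = predict_vote(members, [12.0])
```

In a fuller version, `fit_member` would train a Stacking classifier, that is, several base classifiers whose outputs feed a meta-classifier, so that the final vote is taken over multiple Stacking classifiers as the abstract describes.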

Key words: software defect prediction, data imbalance, hybrid sampling, Random_Stacking, DP_HSRS