Computer and Modernization ›› 2021, Vol. 0 ›› Issue (12): 116-122.


A Feature Filtering and Instance Transfer Framework for Cross-project Defect Prediction

  

  (1. Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China;
   2. Shanghai Aerospace Electronic Technology Institute, Shanghai 201109, China)
  • Online: 2021-12-24    Published: 2021-12-24

Abstract: In cross-project software defect prediction, two factors chiefly limit the performance of the prediction model: the correlation of features between the source and target projects, and the difference in their instance distributions. Addressing both through feature filtering and instance transfer, we propose a cross-project defect prediction framework called KCF-KMM. In the feature filtering phase, the framework applies the K-medoids clustering algorithm to select features, filtering out those with low relevance to the target project. In the instance transfer phase, it uses the kernel mean matching (KMM) algorithm to measure the distribution difference between source-project and target-project instances and, on that basis, assigns an influence weight to each training instance. Finally, it incorporates a small amount of labeled data from the target project to build a mixed defect prediction model. To verify its effectiveness, KCF-KMM is compared with the classic cross-project defect prediction methods TCA+, TNB, and NNFilter in terms of accuracy and F1 score. On the Apache dataset, KCF-KMM improves prediction performance by 34.1%, 0.8%, and 21.1% and by 14.4%, 3.7%, and 10.6%, respectively.
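
The instance-weighting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a Gaussian (RBF) kernel and replaces the usual quadratic-programming solver with projected gradient descent, and it omits the standard KMM constraint on the sum of the weights. The function names (`rbf`, `kmm_weights`), the parameters (`gamma`, `bound`, `steps`), and the toy data are all hypothetical.

```python
import math


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kmm_weights(source, target, gamma=1.0, bound=10.0, steps=500):
    """Approximate kernel-mean-matching weights for the source instances.

    Minimizes 0.5 * b^T K b - kappa^T b subject to 0 <= b_i <= bound by
    projected gradient descent. Simplification (assumption): the usual
    constraint keeping sum(b) close to len(source) is left out.
    """
    n_s, n_t = len(source), len(target)
    # K[i][j]: similarity between source instances i and j.
    K = [[rbf(xi, xj, gamma) for xj in source] for xi in source]
    # kappa[i]: scaled total similarity of source instance i to the target set.
    kappa = [n_s / n_t * sum(rbf(xi, z, gamma) for z in target) for xi in source]
    # K is positive semidefinite, so trace(K) bounds its largest eigenvalue
    # and 1 / trace(K) is a safe step size.
    lr = 1.0 / sum(K[i][i] for i in range(n_s))
    beta = [1.0] * n_s
    for _ in range(steps):
        grad = [sum(K[i][j] * beta[j] for j in range(n_s)) - kappa[i]
                for i in range(n_s)]
        # Gradient step followed by projection onto the box [0, bound].
        beta = [min(bound, max(0.0, b - lr * g)) for b, g in zip(beta, grad)]
    return beta


# Toy data: the third source instance lies far from every target instance.
source = [[0.0], [1.0], [5.0]]
target = [[0.1], [0.9], [1.1]]
weights = kmm_weights(source, target)
print(weights)
```

In this toy run the third source instance, which lies far from all target instances, receives a weight close to zero, while the two instances near the target data keep substantial weights. Such weights could then be supplied as per-instance training weights to a classifier. How KCF-KMM combines them with the labeled target data is not specified in the abstract.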

Key words: source project, target project, feature correlation, distribution difference, feature filtering, instance transfer