Computer and Modernization

Loan Risk Prediction Method Based on SMOTE and XGBoost

  

  1. (School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi’an 710021, China) 
  • Received: 2019-06-11 Online: 2020-03-03 Published: 2020-03-03

Abstract: In recent years, the rapid development of online credit lending has brought continuous growth in both the total loan volume and the probability of default. Studying loan risk is therefore of great practical significance for online credit enterprises seeking to guard against Internet finance risk. To address the imbalanced class distribution, heavy noise, and high dimensionality of loan data, a loan risk prediction method based on SMOTE and XGBoost is proposed. Feature engineering is used to reduce the dimensionality of the data set and to remove noise. To handle the class imbalance, the SMOTE algorithm is used to oversample the minority class, which adjusts the ratio of positive to negative samples. On this basis, an XGBoost classification model is built and compared with several traditional classification algorithms, and the quality of the predictions is compared under different ratios of positive to negative samples. The experiments show that XGBoost predicts loan risk better than the traditional classification models, and that increasing the proportion of minority samples with SMOTE improves the prediction results.

Key words: loan risk, feature engineering, SMOTE algorithm, XGBoost
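
The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. This is not the paper's implementation; the function name `smote`, the parameters `k` and `seed`, and the toy data are assumptions made only for illustration, and the abstract does not state the settings the authors used. In practice one would more likely rely on an existing implementation such as `SMOTE` from imbalanced-learn, followed by a classifier such as `XGBClassifier` from xgboost.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position on the segment from base towards the neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up): two features per loan; the minority class stands for defaults.
majority = [[0.1 * i, 1.0] for i in range(20)]
minority = [[5.0, 5.0], [5.5, 4.5], [6.0, 5.2], [5.2, 5.8]]

# Generate enough synthetic defaults to reach a 1:1 class ratio.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Changing `n_new` changes the ratio of positive to negative samples, which is the quantity the abstract reports varying. Because every synthetic point is a convex combination of two real minority samples, it always lies inside the region spanned by the original minority class.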
