Computer and Modernization ›› 2022, Vol. 0 ›› Issue (09): 111-118.

• Information Security •

Credit Card Fraud Detection Method Based on Improved Hybrid Sampling and the XGBoost Algorithm

  

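  1. (1. Jiangxi Science and Technology Infrastructure Center, Nanchang 330003, Jiangxi, China; 2. China Radio and Television Jiangxi Network Co. Ltd., Nanchang 330006, Jiangxi, China)
  • Publication date: 2022-09-22 Release date: 2022-09-22
  • About the author: Sun Dan (孙丹, b. 1988), female, from Taihe, Jiangxi; engineer, master's degree; research interest: information system security; E-mail: 807965684@qq.com.
  • Funding:
    Jiangxi Provincial Science and Technology Plan Project (20194BBE50087); Jiangxi Provincial Key Research and Development Program (20202BBEL53003, 20192ACB50028)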

Credit Card Fraud Detection Method Based on Improved SMOTE+ENN and XGBoost Algorithm

  1. (1. Jiangxi Science and Technology Infrastructure Center, Nanchang 330003, China;
    2. China Radio and Television Jiangxi Network Co. Ltd., Nanchang 330006, China)
  • Online: 2022-09-22 Published: 2022-09-22

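Abstract: With the rapid growth of the credit card business at financial institutions, credit card fraud has become a serious problem for them. To address the imbalanced distribution of credit card data, this paper balances the data with six sampling methods: oversampling, undersampling, SMOTE+ENN, SMOTE+Tomek links, improved SMOTE+Tomek links, and improved SMOTE+ENN hybrid sampling. The balanced datasets are then fed into several classification models for experimental comparison. Finally, a credit card fraud detection model based on improved SMOTE+ENN hybrid sampling and the XGBoost algorithm is proposed. Five evaluation metrics verify that the method improves not only the separability of imbalanced credit card fraud data but also the accuracy and feasibility of credit card fraud detection.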

Keywords: SMOTE+ENN, XGBoost algorithm, imbalanced data, Credit Card Fraud Detection, evaluation metrics

Abstract: With the rapid development of the credit card business in financial institutions, credit card fraud has become a serious problem for these institutions. To address the imbalanced distribution of credit card data, this paper applies six sampling methods to the imbalanced data: oversampling, undersampling, SMOTE+ENN, SMOTE+Tomek links, improved SMOTE+Tomek links, and improved SMOTE+ENN. The six balanced data sets are then input into several classification models for experimental comparison. Finally, a new credit card fraud detection model combining the improved SMOTE+ENN with the XGBoost algorithm is proposed. Results on five evaluation indicators show that the detection method improves not only the separability of imbalanced credit card fraud data but also the accuracy and feasibility of credit card fraud detection.

Key words: SMOTE+ENN, XGBoost, unbalanced data, Credit Card Fraud Detection, evaluation indicators
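
The hybrid sampling step named in the abstract can be illustrated with a small sketch. The abstract does not say how the improved SMOTE+ENN differs from the standard combination, so the code below shows only textbook SMOTE (interpolating new minority points between nearest minority neighbours) followed by ENN (dropping points whose nearest neighbours mostly carry the other label). It is not the authors' method. The function names, the choice of k = 3, and the toy two-dimensional data are all assumptions made for illustration, and the XGBoost classification step is left out.

```python
import math
import random
from collections import Counter


def k_nearest(point, pool, k):
    """Return indices of the k points in `pool` closest to `point` (Euclidean)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each on the segment between a random
    minority point and one of its k nearest minority neighbours."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        seed = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbour = others[rng.choice(k_nearest(seed, others, k))]
        gap = rng.random()  # position along the segment seed -> neighbour
        synthetic.append(tuple(s + gap * (n - s) for s, n in zip(seed, neighbour)))
    return synthetic


def enn(X, y, k=3):
    """Edited Nearest Neighbours: keep a sample only if the majority label
    among its k nearest neighbours (itself excluded) matches its own label."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others_X = X[:i] + X[i + 1:]
        others_y = y[:i] + y[i + 1:]
        votes = Counter(others_y[j] for j in k_nearest(x, others_X, k))
        if votes.most_common(1)[0][0] == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


def smote_enn(X, y, minority_label=1, k=3, seed=0):
    """Oversample the minority class up to the majority count with SMOTE,
    then clean the combined data set with ENN."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    new_points = smote(minority, n_new, k, rng)
    X_bal = list(X) + new_points
    y_bal = list(y) + [minority_label] * len(new_points)
    return enn(X_bal, y_bal, k)


# Toy data (made up, not the paper's data set): 200 "legitimate" points
# around (0, 0) and 10 "fraud" points around (3, 3).
data_rng = random.Random(42)
legit = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(200)]
fraud = [(data_rng.gauss(3.0, 0.5), data_rng.gauss(3.0, 0.5)) for _ in range(10)]
X = legit + fraud
y = [0] * len(legit) + [1] * len(fraud)

X_res, y_res = smote_enn(X, y)
print("class counts before:", Counter(y))
print("class counts after: ", Counter(y_res))
```

In practice the resampled `X_res`, `y_res` would be passed to a classifier such as XGBoost and scored with metrics suited to imbalanced data. Resampling should be applied to the training split only, so that the test split keeps the original class ratio.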