Computer and Modernization (计算机与现代化)

• Artificial Intelligence

Progression Prediction Model of Chronic Kidney Disease Based on Sparse Logistic Regression and Multiple Ensemble Algorithm

  1. (College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China)
  • Received: 2018-12-12 Online: 2019-04-08 Published: 2019-04-10
  • About the authors: YANG Jinshan (b. 1992), male, from Shehong, Sichuan, master's student; research interest: medical data processing; E-mail: yjsscu549@163.com. LI Zhi (b. 1975), male, from Chengdu, Sichuan, professor and master's supervisor; research interests: compressed sensing and medical data analysis.

Abstract: Only a subset of patients with stage 3 chronic kidney disease (CKD) progress to stage 4, and the clinical data show that some physiological indicators differ markedly between progressive and non-progressive patients. This paper applies, for the first time, sparse logistic regression (SLR) with L1/2-norm regularization to select the key factors that influence CKD progression, and then builds progression-risk prediction models with SLR, support vector machine (SVM), and AdaBoost decision tree (BOOSTDT). In addition, a stacking ensemble algorithm (STKSSD) is introduced to counter the unstable generalization performance caused by the small sample size. For comparison, an artificial neural network (ANN) and a bidirectional long short-term memory network (BLSTM) are also used to model the data. The experimental results show that when SLR selects 11 key features, including phosphorus and serum creatinine, the STKSSD ensemble model performs best, with a test recall of 86.97%, a precision of 92.86%, and an F1-score of 89.82%.
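The abstract names L1/2-regularized logistic regression and reports recall, precision, and F1 without giving formulas. As a reading aid, the block below writes out one commonly used form of the L1/2-penalized logistic regression objective and checks that the reported F1 is consistent with the reported precision and recall. The notation (x_i, y_i, β, λ, n, d) is assumed here for illustration, and the paper's exact formulation and solver may differ.

```latex
% Assumed notation (not from the source):
%   x_i  : indicator vector of patient i,  y_i \in \{0,1\} : progression label
%   \beta_0, \beta : intercept and coefficients,  \lambda > 0 : penalty strength
\begin{align}
p_i &= \frac{1}{1 + \exp\!\left(-\beta_0 - x_i^{\top}\beta\right)}, \\
(\hat{\beta}_0, \hat{\beta}) &= \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i \log p_i + (1 - y_i)\log(1 - p_i)\Bigr]
  + \lambda \sum_{j=1}^{d} \lvert \beta_j \rvert^{1/2}, \\
F_1 &= \frac{2PR}{P + R}
     = \frac{2 \times 0.9286 \times 0.8697}{0.9286 + 0.8697}
     \approx 0.8982 .
\end{align}
```

Features whose fitted coefficient is exactly zero are dropped, which is how a sparsity-inducing penalty doubles as a feature selector. The last line reproduces the reported F1-score of 89.82% from the reported precision (P) and recall (R).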

Key words: SLR, STKSSD, SVM, BOOSTDT, BLSTM, ANN, chronic kidney disease, progression prediction
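The pipeline described in the abstract (sparse feature selection followed by a stacked ensemble of SLR, SVM, and boosted trees) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, synthetic data from `make_classification` in place of the clinical data, an L1 penalty as a stand-in for L1/2 (which scikit-learn does not provide), and a plain logistic regression as the meta-learner. All hyperparameters are arbitrary.

```python
# Illustrative stacking sketch on synthetic data; not the paper's STKSSD setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical table: 300 "patients", 30 indicators.
X, y = make_classification(n_samples=300, n_features=30, n_informative=11,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Step 1: sparse feature selection. ASSUMPTION: L1 replaces the paper's L1/2
# penalty, because scikit-learn has no L1/2 option.
scaler = StandardScaler().fit(X_train)
selector = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
selector.fit(scaler.transform(X_train), y_train)
selected = [j for j, w in enumerate(selector.coef_[0]) if w != 0.0]
if not selected:  # fall back to all features if the penalty removed everything
    selected = list(range(X.shape[1]))

# Step 2: stack three base learners; the meta-learner is trained on their
# out-of-fold predictions (cv=5), which is what makes this stacking.
base_learners = [
    ("slr", LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
    ("svm", SVC(kernel="rbf", probability=True, random_state=0)),
    ("boostdt", AdaBoostClassifier(n_estimators=50, random_state=0)),
]
stack = make_pipeline(
    StandardScaler(),
    StackingClassifier(estimators=base_learners,
                       final_estimator=LogisticRegression(),
                       cv=5),
)
stack.fit(X_train[:, selected], y_train)
y_pred = stack.predict(X_test[:, selected])

# Step 3: the same three test metrics the abstract reports.
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
f1 = f1_score(y_test, y_pred, zero_division=0)
print(f"features kept: {len(selected)}")
print(f"recall={recall:.3f} precision={precision:.3f} f1={f1:.3f}")
```

The numbers printed by this sketch come from synthetic data and say nothing about the paper's results. The point is only the structure: selection first, then base learners whose cross-validated predictions feed a meta-learner, which is one way to reduce variance when the sample is small.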
