Computer and Modernization (计算机与现代化)

• Artificial Intelligence •

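A Classification Method of Thyroid Disease Based on Rotation Forest

  1. (School of Computer Science and Technology, Donghua University, Shanghai 201620, China)
  • Received: 2015-11-13 Online: 2016-03-17 Published: 2016-03-17
  • About the authors: Pan Qiao (潘乔, 1977-), male, from Hanzhong, Shaanxi, associate professor at the School of Computer Science and Technology, Donghua University, Ph.D., research interests: data mining and network performance analysis; Xu Teng (许腾, 1990-), male, from Xuzhou, Jiangsu, master's student, research interest: data mining; Chen Dehua (陈德华, 1976-), male, from Fuzhou, Fujian, associate professor, Ph.D., research interests: databases, data warehousing, and smart healthcare; Xu Guangwei (徐光伟, 1969-), male, from Hengyang, Hunan, associate professor, Ph.D., research interest: wireless sensor networks.
  • Funding:
    Natural Science Foundation of Shanghai (15ZR1400900); Science and Technology Innovation Action Plan of the Science and Technology Commission of Shanghai Municipality (13511504905)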

Abstract: Thyroid disease is a common endocrine disorder, and accurately identifying its different types is a primary problem in clinical diagnosis. Working from thyroid test indicator data, this paper proposes a new classification method for thyroid disease. The method first applies principal component analysis (PCA) for feature selection, reducing the dimensionality of the data, and then performs classification with the rotation forest ensemble algorithm. Rotation forest increases the diversity among the base classifiers, which improves the accuracy of the ensemble, and it can also reduce processing time. On a standard dataset from the UCI Machine Learning Repository, the method reaches a classification accuracy of 96.28%. To verify its effectiveness further, the method was also evaluated on a real clinical dataset that is larger and higher-dimensional than the UCI dataset; in comparison with other methods, its classification accuracy on that dataset reaches 96.37%.

Key words: thyroid disease, ensemble classification, rotation forest, feature selection, principal component analysis

CLC number:
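
The two-stage pipeline described in the abstract (PCA for dimensionality reduction, then a rotation forest ensemble) can be sketched roughly as follows. This is an illustrative sketch and not the authors' implementation: it assumes NumPy and scikit-learn are available, the class name `RotationForestSketch` and all hyperparameters (`n_trees`, `n_subsets`, `sample_frac`, the 95% variance threshold) are hypothetical choices, and synthetic data from `make_classification` stands in for the UCI and clinical thyroid datasets, which are not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


class RotationForestSketch:
    """Minimal rotation forest: each tree is trained on its own PCA-rotated features."""

    def __init__(self, n_trees=10, n_subsets=3, sample_frac=0.75, seed=0):
        self.n_trees = n_trees          # hypothetical default
        self.n_subsets = n_subsets      # hypothetical default
        self.sample_frac = sample_frac  # hypothetical default
        self.rng = np.random.default_rng(seed)

    def _rotation(self, X):
        n, d = X.shape
        R = np.zeros((d, d))
        # Randomly partition the feature indices into disjoint subsets.
        subsets = np.array_split(self.rng.permutation(d), min(self.n_subsets, d))
        for cols in subsets:
            # Fit PCA on a bootstrap sample of rows, restricted to this subset.
            size = max(len(cols), int(self.sample_frac * n))
            rows = self.rng.choice(n, size=size, replace=True)
            pca = PCA().fit(X[np.ix_(rows, cols)])
            # Keep all components and place them in this subset's block of R.
            R[np.ix_(cols, cols)] = pca.components_.T
        return R

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.members_ = []
        for _ in range(self.n_trees):
            R = self._rotation(X)
            tree = DecisionTreeClassifier(random_state=int(self.rng.integers(1 << 31)))
            self.members_.append((R, tree.fit(X @ R, y)))
        return self

    def predict(self, X):
        # Average the class probabilities of all trees, each in its own rotated space.
        proba = np.mean([tree.predict_proba(X @ R) for R, tree in self.members_], axis=0)
        return self.classes_[proba.argmax(axis=1)]


# Synthetic stand-in data (an assumption; not the thyroid data used in the paper).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

# Stage 1: PCA keeps enough components to explain 95% of the variance.
reducer = PCA(n_components=0.95).fit(X_tr)

# Stage 2: rotation forest on the reduced features.
model = RotationForestSketch().fit(reducer.transform(X_tr), y_tr)
y_pred = model.predict(reducer.transform(X_te))
accuracy = float(np.mean(y_pred == y_te))
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because every tree receives a different rotation of the same features, the axis-aligned splits of the individual trees differ from one another, which is the source of the base-classifier diversity the abstract refers to. The accuracy printed here describes only the synthetic data and says nothing about the 96.28% and 96.37% figures reported in the paper.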