Computer and Modernization ›› 2020, Vol. 0 ›› Issue (12): 67-71.

• Databases and Data Mining •

A Subspace Clustering Method for Four-Diagnosis Analysis in Traditional Chinese Medicine

  

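  1. (1. Changzhou Hospital of Traditional Chinese Medicine, Changzhou 213003, Jiangsu, China; 2. College of Network and Communication Engineering, Jinling Institute of Technology, Nanjing 211169, Jiangsu, China;
    3. College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, Anhui, China)
  • Publication date: 2021-01-07 Release date: 2021-01-07
  • About the authors: Xu Lihui (许立辉, born 1977), male, from Changzhou, Jiangsu; engineer, master's degree; research interests: information management, data mining, analysis of TCM diseases and syndromes; E-mail: lwtg_xlh@126.com. Chen Min (陈敏, born 1995), male, from Huai'an, Jiangsu; master's student; research interests: artificial intelligence, big data technology; E-mail: 842679178@qq.com. Wang Chishe (王池社, born 1974), male; professor, Ph.D.; research interests: big data technology, artificial intelligence, machine learning; E-mail: wyxcs@163.com.
  • Funding:
    National Natural Science Foundation of China (61375121)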

 Subspace Clustering Method for Analysis of Four-diagnoses of Traditional Chinese Medicine

  1. (1. Changzhou Hospital of Traditional Chinese Medicine, Changzhou 213003, China;
    2. College of Network and Communication Engineering, Jinling Institute of Technology, Nanjing 211169, China;
    3. College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China)
  • Online:2021-01-07 Published:2021-01-07

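Abstract: Four-diagnosis analysis in traditional Chinese medicine (TCM) is an important part of research on TCM syndrome classification based on four-diagnosis information. An effective four-diagnosis analysis model can better uncover the associations among TCM syndromes and thereby provide decision support for TCM clinical practice. By analyzing the subspace clustering algorithm CLIQUE and taking into account the characteristics of four-diagnosis data, this paper proposes an improved CLIQUE algorithm based on a restricted-space search strategy (ChM-CLIQUE). The search strategy of CLIQUE is optimized: clusters are generated by a depth-first search centered on the dense unit with the highest grid density, which improves the algorithm's performance. In addition, an adaptive grid density, based on the Gaussian distribution of samples within a cluster, is introduced to improve the accuracy with which cluster boundaries are identified. Several groups of comparative experiments on a data set collected in TCM clinical practice show that the silhouette coefficient of the proposed algorithm is significantly higher than that of CLIQUE.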

Keywords: four-diagnosis information, subspace clustering, CLIQUE algorithm
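
The search idea summarized in the abstract (find dense grid cells, then grow each cluster by depth-first search starting from the densest cell) can be illustrated with a small sketch. This is not the authors' ChM-CLIQUE: it works in one fixed full-dimensional space rather than searching subspaces, it uses a fixed density threshold instead of the adaptive grid density, and the names `grid_cells`, `neighbours`, `cluster_from_densest`, `xi` (intervals per axis) and `tau` (density threshold) are all assumptions made for illustration.

```python
from collections import Counter


def grid_cells(points, xi, lo=0.0, hi=1.0):
    """Count how many points fall in each cell of a grid with xi intervals per axis."""
    width = (hi - lo) / xi
    counts = Counter()
    for p in points:
        # Clamp so that a coordinate equal to `hi` lands in the last interval.
        cell = tuple(min(int((x - lo) / width), xi - 1) for x in p)
        counts[cell] += 1
    return counts


def neighbours(cell):
    """Yield the cells that share a face with `cell` (differ by 1 on one axis)."""
    for axis in range(len(cell)):
        for step in (-1, 1):
            yield cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]


def cluster_from_densest(counts, tau):
    """Group dense cells into clusters.

    Each depth-first search is seeded at the densest dense cell that has not
    been assigned to a cluster yet (hypothetical simplification: fixed `tau`).
    """
    unassigned = {c for c, n in counts.items() if n >= tau}
    clusters = []
    # Try seeds in order of decreasing density; ties are broken by cell index.
    for seed in sorted(unassigned, key=lambda c: (-counts[c], c)):
        if seed not in unassigned:
            continue  # already absorbed by an earlier cluster
        unassigned.remove(seed)
        stack, members = [seed], []
        while stack:
            cell = stack.pop()
            members.append(cell)
            for nb in neighbours(cell):
                if nb in unassigned:
                    unassigned.remove(nb)
                    stack.append(nb)
        clusters.append(sorted(members))
    return clusters
```

Calling `grid_cells(points, xi)` on points scaled to the unit cube and passing the result to `cluster_from_densest(counts, tau)` returns a list of clusters, each a sorted list of cell indices. Seeding at the densest cell means each cluster grows outward from its likely center, which is the intuition the abstract gives for the modified search order.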

Abstract: Four-diagnosis analysis in traditional Chinese medicine (TCM) is an important part of analyzing TCM syndromes from four-diagnosis information. An effective four-diagnosis analysis model can better mine the correlations between TCM syndromes and provide decision support for TCM clinical practice. Based on an analysis of the subspace clustering algorithm CLIQUE and the characteristics of four-diagnosis data, this paper proposes an improved CLIQUE algorithm (ChM-CLIQUE) that uses a limited-space search strategy. The search strategy of CLIQUE is optimized by running a depth-first search centered on the dense cell with the largest grid density to generate clusters, which improves the algorithm's performance. An adaptive grid density, based on the Gaussian distribution of samples within clusters, is also introduced to improve the recognition accuracy of cluster boundaries. Multiple sets of comparative experiments were carried out on a data set collected in TCM clinical practice. The results show that the silhouette coefficients of the proposed algorithm are significantly higher than those of the CLIQUE algorithm, by 12.6% and 19.3% respectively.

Key words: four-diagnosis analysis, subspace clustering, CLIQUE algorithm
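
The evaluation metric named in the abstract, the silhouette coefficient, compares how close each sample is to its own cluster with how close it is to the nearest other cluster. The sketch below follows the standard textbook definition with Euclidean distance. It is not the authors' evaluation code, and the function name `silhouette_score` and the convention of scoring singleton clusters as 0 are assumptions.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Points in singleton clusters contribute 0, a common convention.
    Raises ValueError when fewer than two clusters are present.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: score 0
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself is 0, so it does not affect the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)
```

Scores lie in [-1, 1]; values near 1 indicate compact, well-separated clusters, and negative values indicate samples that sit closer to another cluster than to their own.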