Computer and Modernization (计算机与现代化)

• Artificial Intelligence

A Novel Assembled Text Clustering Algorithm Using Differential Evolution and SOM

  1. (Dongchang College of Liaocheng University, Liaocheng, Shandong 252000, China)
  • Received: 2015-01-06 Online: 2015-05-18 Published: 2015-05-18
  • About the authors: JIANG Kai (姜凯, b. 1986), male, from Jiaozhou, Shandong; teaching assistant at Dongchang College of Liaocheng University; M.S.; research interest: text mining. YUAN Jinhai (苑金海, b. 1979), male; lecturer; M.S.; research interests: data mining and artificial intelligence.
  • Supported by:
    Scientific Research Program of the Shandong Provincial Department of Education (J13LN75)

Abstract: The self-organizing map (SOM) is an important clustering model that can effectively improve the precision of search engines, but it is sensitive to its initial connection weights. To overcome this weakness, a combined document clustering algorithm, IDE-SOM, is proposed that couples an improved differential evolution algorithm with SOM. First, the improved differential evolution algorithm performs a coarse clustering of the document set in order to optimize the initial connection weights of the SOM network. The SOM network is then initialized with these weights and performs the fine clustering. Simulation experiments show that the algorithm achieves good clustering results on evaluation measures such as F-measure and entropy.

Key words: improved differential evolution algorithm, self-organizing map (SOM), assembled text clustering
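
The two-stage idea in the abstract (a coarse evolutionary clustering that supplies the SOM's initial weights, followed by SOM training) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the improved differential evolution operator, so the sketch uses plain DE/rand/1/bin; the fitness (sum of squared distances to the nearest centroid), the 1-D SOM layout with one unit per cluster, the parameter defaults, and all function names are hypothetical, and the input is taken to be a list of equal-length numeric vectors (for example TF-IDF document vectors).

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(flat, data, k, dim):
    """Fitness: sum of squared distances from each point to its nearest centroid."""
    cents = [flat[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(min(sq_dist(p, c) for c in cents) for p in data)


def de_coarse_clustering(data, k, pop_size=20, gens=60, f=0.5, cr=0.9, rng=None):
    """Coarse stage: standard DE/rand/1/bin (assumed; not the paper's improved variant)."""
    rng = rng or random.Random(0)
    dim = len(data[0])
    # Each individual is k centroids flattened into one list, seeded from data points.
    pop = [[x for p in rng.sample(data, k) for x in p] for _ in range(pop_size)]
    fit = [sse(ind, data, k, dim) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(k * dim)  # forces at least one gene from the mutant
            trial = [
                pop[a][j] + f * (pop[b][j] - pop[c][j])
                if (rng.random() < cr or j == j_rand) else pop[i][j]
                for j in range(k * dim)
            ]
            trial_fit = sse(trial, data, k, dim)
            if trial_fit <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, trial_fit
    best = pop[min(range(pop_size), key=fit.__getitem__)]
    return [best[i * dim:(i + 1) * dim] for i in range(k)]


def som_fine_clustering(data, init_weights, epochs=30, lr0=0.5, sigma0=1.0, rng=None):
    """Fine stage: train a 1-D SOM starting from the given weights; return (weights, labels)."""
    rng = rng or random.Random(0)
    w = [list(u) for u in init_weights]
    order = list(range(len(data)))
    for e in range(epochs):
        frac = e / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 1e-3)   # shrinking neighbourhood width
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            bmu = min(range(len(w)), key=lambda u: sq_dist(x, w[u]))  # best-matching unit
            for u in range(len(w)):
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                w[u] = [wi + lr * h * (xi - wi) for wi, xi in zip(w[u], x)]
    labels = [min(range(len(w)), key=lambda u: sq_dist(x, w[u])) for x in data]
    return w, labels


def de_som_sketch(data, k, seed=0):
    """Run the coarse DE stage, then use its centroids as the SOM's initial weights."""
    rng = random.Random(seed)
    coarse = de_coarse_clustering(data, k, rng=rng)
    return som_fine_clustering(data, coarse, rng=rng)
```

Calling `de_som_sketch(vectors, k)` returns the trained unit weights and one cluster label per input vector. The only coupling between the two stages is the `init_weights` argument, which is where the sensitivity to initial connection weights mentioned in the abstract would be addressed.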

CLC number: