Computer and Modernization (计算机与现代化)

• Artificial Intelligence

Network Hot Event Mining Algorithm Based on Feature Extraction

  1. (School of Computer Science and Technology, Pingdingshan University, Pingdingshan 467000, China)
  • Received: 2014-12-23 Online: 2015-05-18 Published: 2015-05-18
  • About the authors: LI Weiyao (李玮瑶, born 1982), female, from Xuchang, Henan; lecturer at the School of Computer Science and Technology, Pingdingshan University; master's degree; research interests: data mining and algorithms. ZHAO Kai (赵凯, born 1982), male, from Pingdingshan, Henan; lecturer; master's degree; research interests: vector machines and workflow engines.
  • Supported by:
    Henan Province Key Science and Technology Research Project (132102210443)

Abstract: To mine effectively the hot events and topics that the public cares about on the network, and to improve data classification capability and the accuracy of hot-topic tracking and detection, we analyze the problems of the traditional unstructured mining algorithms currently in use and propose a mining algorithm based on structured segmentation. We first analyze the processing flow of hot event mining and design a semi-structured feature extraction algorithm for hot event data. The algorithm performs feature segmentation on semi-structured data and generates a large number of requests, from which the allocation factors of the hot event data are obtained, thereby improving mining performance. Simulation results show that the algorithm runs efficiently, achieves good accuracy, and is highly robust.

Key words: network hot event, data mining, semi-structured data, feature segmentation

CLC Number: