Computer and Modernization

• Artificial Intelligence

A Parallel Algorithm for Mining onshelf Utility Itemset with Negative Item Values

  1. (College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China)
  • Publication date: 2018-04-28 Release date: 2018-05-02
  • About the authors: CHEN Lijuan (b. 1993), female, from Putian, Fujian; master's student at the College of Mathematics and Computer Science, Fuzhou University; research interest: data mining. Corresponding author: XIE Huosheng (b. 1964), male, associate professor, master's degree; research interests: data mining, graphics and image processing.
  • Funding:
    Natural Science Foundation of Fujian Province (2014J01229)

Abstract: To improve the efficiency of mining on-shelf utility itemsets with negative item values, this paper proposes DTPHoun (distributed TPHoun), a parallel algorithm built on the MapReduce framework. The algorithm exploits the on-shelf time periods by partitioning the original transaction database by time period, and it casts the mining process as a MapReduce job: the Map phase mines candidate itemsets in each database partition, and the Reduce phase computes the on-shelf utility values of the candidates in parallel. Experimental results show that DTPHoun achieves high mining efficiency.

Key words: utility itemset mining, on-shelf time periods, MapReduce, negative item values
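
The Map/Reduce division described in the abstract can be illustrated with a small single-process sketch. It is not the DTPHoun implementation: the toy data, the names `TRANSACTIONS`, `PROFIT`, `MIN_RATIO`, `map_phase`, `reduce_phase`, and `mine`, the brute-force candidate enumeration (standing in for TPHoun's mining step), and the simplified on-shelf utility ratio (itemset utility divided by the positive-profit transaction utility of the periods in which every item of the itemset appears) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (time period, {item: purchased quantity}).
TRANSACTIONS = [
    (1, {"a": 2, "b": 1}),
    (1, {"a": 1, "c": 3}),
    (2, {"b": 2, "c": 1}),
    (2, {"a": 1, "b": 1, "c": 1}),
]
# Hypothetical unit profits; item "c" is sold at a loss (negative value).
PROFIT = {"a": 5, "b": 3, "c": -1}
MIN_RATIO = 0.3  # assumed minimum on-shelf utility ratio


def utility(itemset, items):
    """Utility of itemset in one transaction, or 0 if not fully contained."""
    if not all(i in items for i in itemset):
        return 0
    return sum(items[i] * PROFIT[i] for i in itemset)


def positive_utility(items):
    """Transaction utility counting only items with positive profit."""
    return sum(q * PROFIT[i] for i, q in items.items() if PROFIT[i] > 0)


def split_by_period(transactions):
    """Partition the database by time period (one partition per period)."""
    parts = defaultdict(list)
    for period, items in transactions:
        parts[period].append(items)
    return dict(parts)


def map_phase(partition):
    """Mine local candidates in one partition (brute force, for clarity)."""
    total = sum(positive_utility(t) for t in partition)
    shelf = sorted({i for t in partition for i in t})
    candidates = set()
    for size in range(1, len(shelf) + 1):
        for itemset in combinations(shelf, size):
            u = sum(utility(itemset, t) for t in partition)
            if total > 0 and u / total >= MIN_RATIO:
                candidates.add(itemset)
    return candidates


def reduce_phase(itemset, partitions):
    """On-shelf utility ratio of one candidate over its on-shelf periods."""
    u = total = 0
    for partition in partitions.values():
        shelf = {i for t in partition for i in t}
        if set(itemset) <= shelf:  # every item is on shelf in this period
            u += sum(utility(itemset, t) for t in partition)
            total += sum(positive_utility(t) for t in partition)
    return u / total if total else 0.0


def mine(transactions):
    """Run the map step per partition, then score each candidate globally."""
    partitions = split_by_period(transactions)
    candidates = set().union(*(map_phase(p) for p in partitions.values()))
    ratios = {c: reduce_phase(c, partitions) for c in candidates}
    return {c: r for c, r in ratios.items() if r >= MIN_RATIO}
```

On this toy data, `mine(TRANSACTIONS)` keeps `('a',)`, `('b',)`, and `('a', 'b')`; the itemsets `('b', 'c')` and `('a', 'b', 'c')` pass the local check in period 2 but fall below the threshold once both periods are combined. In a real MapReduce deployment each call to `map_phase` would run on a separate partition and each call to `reduce_phase` on a separate candidate key.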

CLC number: