Computer and Modernization

• Algorithm Design and Analysis

Research and Application on Association Rule Mining Optimization Algorithm for High Speed EMU Malfunction

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2016-12-15 Online: 2017-09-20 Published: 2017-09-19
  • Supported by:
    National High Technology Research and Development Program of China (2015AA043701); Science and Technology Research and Development Program of China Railway Corporation (2015J006-C)

Abstract: Electric multiple units (EMUs), as high-speed and safe railway passenger equipment, inevitably experience faults during operation. Because EMU fault data is huge in volume and low in value density, we design an optimized association rule mining algorithm based on the DHP algorithm. We use a rehashing technique to resolve hash collisions in DHP and propose the RDHP algorithm, which filters out all infrequent itemsets without additional database scans. To further improve efficiency, we propose the MR-RDHP algorithm, based on the MapReduce programming model, which splits the task of mining association rules from massive EMU fault data across multiple computers in a cluster for parallel processing. Experiments show that MR-RDHP has good time performance and that the mined rules can effectively guide EMU operation and maintenance.

Key words: EMU, association rule mining, DHP algorithm, rehashing, MapReduce
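
The abstract does not give the details of RDHP or MR-RDHP, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "rehashing" means a colliding 2-itemset is moved to another slot so that each slot counts exactly one itemset; linear probing stands in for whatever rehash function the paper actually uses. The function names, the toy hash, and the fault codes in the test data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_hash(pair, size):
    # Toy deterministic hash over item names (an assumption for illustration).
    return sum(ord(ch) for item in pair for ch in item) % size


def dhp_bucket_counts(transactions, size):
    """Classic DHP-style pass: colliding pairs share one bucket counter,
    so an infrequent pair can hide behind a frequent one."""
    buckets = [0] * size
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            buckets[pair_hash(pair, size)] += 1
    return buckets


def rehash_pair_counts(transactions, size):
    """Collision-resolving variant: each slot remembers its pair, and a
    colliding pair is rehashed (linear probing here) to a free slot,
    so every count is exact after a single scan."""
    slots = [None] * size  # each used slot holds [pair, count]
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            i = pair_hash(pair, size)
            probes = 0
            while slots[i] is not None and slots[i][0] != pair:
                i = (i + 1) % size  # rehash step (assumed strategy)
                probes += 1
                if probes >= size:
                    raise ValueError("hash table too small for distinct pairs")
            if slots[i] is None:
                slots[i] = [pair, 0]
            slots[i][1] += 1
    return {s[0]: s[1] for s in slots if s is not None}


def frequent_pairs(counts, min_support):
    # Exact counts let us drop every infrequent pair without another scan.
    return {p: c for p, c in counts.items() if c >= min_support}


def map_reduce_pair_counts(splits, size):
    """MapReduce-shaped counting on one machine: count each data split
    independently ("map"), then sum the partial counts ("reduce")."""
    total = Counter()
    for split in splits:
        total.update(rehash_pair_counts(split, size))
    return dict(total)
```

In a real cluster each split would be counted on a different node; the loop above only mimics that data flow. The table size must be at least the number of distinct pairs, which is why the sketch raises an error instead of probing forever.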

CLC number: