Computer and Modernization ›› 2013, Vol. 1 ›› Issue (9): 35-37,4. doi: 10.3969/j.issn.1006-2475.2013.09.008

• Artificial Intelligence •

MapReduce-based Web Log Mining Preprocessing

MAO Yan-qi, PENG Pei-fu   

  1. School of Physics and Information Science, Hunan Normal University, Changsha 410081, China
  • Received: 2013-03-25 Online: 2013-09-17 Published: 2013-09-17

Abstract: This article describes the general process of Web log mining and focuses on Web session division, a step in Web log mining preprocessing. It also introduces the concept and advantages of cloud computing. To address the current bottlenecks of Web log mining and the difficulties of data storage and sharing, the article proposes MapReduce-based Web log mining preprocessing, which better addresses the efficiency problems Web log mining currently faces, integrates computing resources more effectively, and reduces unnecessary waste of resources.

Key words: Web log mining, MapReduce, session division
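
The abstract does not give the session division algorithm itself, so the following is only a minimal illustrative sketch of how session division can be phrased as a map step and a reduce step. Everything specific in it is an assumption rather than the authors' method: the Common Log Format input, the identification of a user by client IP alone, the 30-minute inactivity timeout, and the hypothetical names `map_line`, `reduce_user`, and `run_job`. The shuffle phase is simulated locally with a dictionary instead of a Hadoop cluster.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: input lines follow the Common Log Format.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+)[^"]*"')

# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def map_line(line):
    """Map step: emit (client IP, (timestamp, URL)) for each parseable line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []  # skip malformed lines
    ip, raw_time, url = match.groups()
    timestamp = datetime.strptime(raw_time, "%d/%b/%Y:%H:%M:%S %z")
    return [(ip, (timestamp, url))]


def reduce_user(ip, requests):
    """Reduce step: order one user's requests by time and cut at long gaps."""
    sessions = []
    previous_time = None
    for timestamp, url in sorted(requests):
        if previous_time is None or timestamp - previous_time > SESSION_TIMEOUT:
            sessions.append([])  # open a new session
        sessions[-1].append(url)
        previous_time = timestamp
    return ip, sessions


def run_job(lines):
    """Simulate the shuffle locally: group mapped values by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_line(line):
            grouped[key].append(value)
    return dict(reduce_user(ip, reqs) for ip, reqs in grouped.items())
```

Because each user's requests are reduced independently, the reduce step can run in parallel across keys, which is the property a MapReduce formulation of preprocessing relies on.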
