Apriori-based Web Traversal Pattern Mining Algorithm

doi:10.3969/j.issn.1006-2475.2013.10.001

Computer and Modernization, 2013, Vol. 218, Issue 10: 1-5.

• Algorithm Design and Analysis •

LIU Mei-ling^1,2, SU Yi-juan^2,3

1. College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China; 2. Key Laboratory of Science Computing and Intelligent Information Processing in Universities of Guangxi, Guangxi Teachers Education University, Nanning 530023, China; 3. College of Computer and Information Engineering, Guangxi Teachers Education University, Nanning 530023, China

Received: 2013-05-24  Online: 2013-10-26  Published: 2013-10-26

Abstract

Abstract: This paper briefly introduces the Apriori algorithm and the directed-graph representation of Web traversal paths, and proposes an algorithm for mining frequent traversal patterns from Web log files. The algorithm builds on Apriori and uses the ordered nature of traversal path sequences as the pruning strategy for candidate sets, which reduces the number of candidates generated and improves efficiency. Experiments on both real and simulated datasets show that the algorithm is effective and adapts well.

Key words: WFTP algorithm, Web log file, data mining, frequently traversed path, sequential traversed path
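The abstract does not give the details of the WFTP algorithm, so the following is only a minimal sketch of the general idea it describes: level-wise (Apriori-style) mining in which the order of pages in a path restricts candidate generation. It assumes, purely for illustration, that a traversal path is a contiguous run of pages within a session and that support is the number of sessions containing the path. The names `contains_path` and `mine_frequent_paths` are hypothetical and do not come from the paper.

```python
from collections import Counter


def contains_path(session, path):
    """Return True if `path` occurs in `session` as a contiguous run of pages."""
    k = len(path)
    return any(tuple(session[i:i + k]) == path
               for i in range(len(session) - k + 1))


def mine_frequent_paths(sessions, min_support):
    """Level-wise mining of frequent contiguous traversal paths (illustrative).

    sessions: list of page-ID sequences, one per user session.
    min_support: minimum number of sessions that must contain a path.
    Returns a dict mapping each frequent path (a tuple) to its support count.
    """
    # Level 1: count each page at most once per session.
    counts = Counter(page for s in sessions for page in set(s))
    frequent = {(p,): c for p, c in counts.items() if c >= min_support}
    result = dict(frequent)

    while frequent:
        # Ordered join: extend path p with the last page of q only when the
        # suffix of p equals the prefix of q. Because order matters, far fewer
        # candidates are produced than by joining unordered itemsets.
        candidates = {p + q[-1:]
                      for p in frequent for q in frequent
                      if p[1:] == q[:-1]}
        frequent = {}
        for cand in candidates:
            support = sum(contains_path(s, cand) for s in sessions)
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
    return result


# Hypothetical sessions, as might be extracted from a Web log file.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C"],
    ["A", "B", "C", "D"],
]
patterns = mine_frequent_paths(sessions, min_support=2)
```

Under the contiguity assumption, the only length k-1 sub-paths of a length-k candidate are its prefix and its suffix, and the ordered join already requires both to be frequent, so no separate pruning pass is needed in this sketch. A definition of paths that allows gaps would need a different containment test and an explicit pruning step.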

CLC number: TP301.6

Cite this article: LIU Mei-ling, SU Yi-juan. Apriori-based Web Traversal Pattern Mining Algorithm[J]. Computer and Modernization, 2013, 218(10): 1-5.

