计算机与现代化 (Computer and Modernization)

• Information Security

Simulation of Web Crawler Detection Algorithm Based on Hidden Markov Model

  1. College of Fenghuo, Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074, China
  • Received: 2016-08-30 Online: 2017-04-20 Published: 2017-05-08
  • About the author: 琚兴空 (b. 1991), male, from Wuhan, Hubei; master's student at the College of Fenghuo, Wuhan Research Institute of Posts and Telecommunications; research interest: network security.

Abstract: When building and maintaining a website, developers need to distinguish human users from Web crawlers in order to improve server efficiency and strengthen security and confidentiality. However, poorly designed or malicious crawlers are difficult to detect; they not only increase the load on the site but also endanger network security. To address this problem, this paper proposes a detection technique based on behavior patterns: a hidden Markov model describes the behavior patterns of different clients, and the detection is implemented and evaluated in a Matlab simulation. The simulation results show that detection based on the hidden Markov model identifies Web crawlers with high accuracy and a low error rate.

Key words: Web crawler detection, behavior pattern, hidden Markov model, network security
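
The abstract does not give the model's states, observation symbols, or parameters, so the following is only a minimal sketch of how likelihood-based detection with two hidden Markov models could look. It is written in Python rather than the paper's Matlab. The observation alphabet (`PAGE`, `RESOURCE`, `OTHER`), the two-state models `HUMAN` and `CRAWLER`, and every probability below are hypothetical values chosen for illustration, not the authors' trained models. A request sequence is scored under each model with the scaled forward algorithm and assigned to the model with the higher log-likelihood.

```python
import math

# Hypothetical observation alphabet (an assumption, not taken from the paper):
# the type of each request in one client session.
PAGE, RESOURCE, OTHER = 0, 1, 2

# Each model is (start, trans, emit). All numbers are illustrative only.
# HUMAN: a page request is usually followed by a burst of embedded resources.
HUMAN = (
    [0.8, 0.2],
    [[0.3, 0.7], [0.6, 0.4]],
    [[0.90, 0.08, 0.02], [0.05, 0.94, 0.01]],
)
# CRAWLER: mostly page after page, few embedded resources.
CRAWLER = (
    [0.9, 0.1],
    [[0.9, 0.1], [0.7, 0.3]],
    [[0.85, 0.05, 0.10], [0.30, 0.50, 0.20]],
)


def log_likelihood(obs, model):
    """Return log P(obs | model) using the scaled forward algorithm."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    start, trans, emit = model
    n = len(start)
    # Forward variables for the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Accumulate the log of each scaling factor to avoid underflow.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(obs):
    """Label a session by the model that explains it better."""
    if log_likelihood(obs, CRAWLER) > log_likelihood(obs, HUMAN):
        return "crawler"
    return "human"
```

With these illustrative parameters, a session made up almost entirely of page requests scores higher under `CRAWLER`, while a session in which each page is followed by several resource requests scores higher under `HUMAN`. In practice the parameters would be estimated from labelled traffic rather than set by hand.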
