Computer and Modernization

• Information Security

Intrusion Detection Data Classification by Distributed Computing

  1. School of Computer & Information Engineering, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China; 2. School of Economics and Management, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China
  • Received: 2015-08-05 Online: 2015-12-23 Published: 2015-12-30
  • About the authors: Shen Lixiang (b. 1977), female, from Haimen, Jiangsu; lecturer at the School of Computer & Information Engineering, Changzhou Institute of Technology; master's degree; research interests: network security and data mining. Cao Guo (b. 1975), male, from Shucheng, Anhui; lecturer at the School of Economics and Management, Changzhou Institute of Technology; master's degree; research interest: data mining.
  • Supported by:
    Youth Fund Project of Humanities and Social Sciences Research, Ministry of Education of China (11YJCZH005)

Abstract: To handle the security problems posed by rapidly growing volumes of network data, parallel Naive Bayes and parallel Logistic Regression algorithms were applied to intrusion detection big data on the Hadoop cloud computing platform. The experiments were run in both pseudo-distributed mode and fully distributed mode. The results show that the classification accuracy of both algorithms exceeds 90%, and that Logistic Regression takes longer to run than Naive Bayes. Running Naive Bayes on a Hadoop cluster effectively reduces its run time. Taking both running time and classification accuracy into account, Naive Bayes handles intrusion detection big data more effectively than Logistic Regression, and Naive Bayes under parallel computation can effectively analyze intrusion detection big data.

Key words: intrusion detection, Naive Bayes, Logistic Regression
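
The reason Naive Bayes parallelizes well can be illustrated with a minimal sketch. It is not the authors' Hadoop implementation: it is a plain-Python stand-in in which each data split is counted independently (the map step) and the partial counts are summed (the reduce step). The toy records, the two categorical features, and the function names `map_counts`, `reduce_counts`, and `predict` are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce
import math


def map_counts(partition):
    """Map step: count labels and (feature index, value, label) triples in one split."""
    counts = Counter()
    for features, label in partition:
        counts[("label", label)] += 1
        for i, value in enumerate(features):
            counts[("feat", i, value, label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge the partial counts of two splits by addition."""
    return a + b


def predict(counts, features, alpha=1.0):
    """Return the label with the highest Laplace-smoothed log posterior."""
    labels = [k[1] for k in counts if k[0] == "label"]
    total = sum(counts[("label", y)] for y in labels)
    # Distinct values observed per feature index, used by the smoothing term.
    values = {}
    for k in counts:
        if k[0] == "feat":
            values.setdefault(k[1], set()).add(k[2])
    best, best_score = None, -math.inf
    for y in labels:
        n_y = counts[("label", y)]
        score = math.log(n_y / total)
        for i, v in enumerate(features):
            num = counts[("feat", i, v, y)] + alpha
            den = n_y + alpha * len(values.get(i, ()))
            score += math.log(num / den)
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical toy records: (protocol, service) -> label, held in two splits.
split_a = [(("tcp", "http"), "normal"),
           (("tcp", "http"), "normal"),
           (("icmp", "ecr_i"), "attack")]
split_b = [(("icmp", "ecr_i"), "attack"),
           (("tcp", "private"), "attack"),
           (("udp", "domain"), "normal")]

# Each split is counted independently, then the partial counts are merged.
model = reduce(reduce_counts, map(map_counts, [split_a, split_b]))
print(predict(model, ("icmp", "ecr_i")))
print(predict(model, ("tcp", "http")))
```

Because training reduces to additive counts, the merged model is identical to one trained on all records at once, so adding nodes divides the counting work without changing the result. Logistic Regression is usually fitted iteratively, which may be one reason for the longer run time reported in the abstract, although the abstract itself does not give the cause.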

CLC number: