Computer and Modernization ›› 2016, Vol. 251 ›› Issue (07): 95-97, 102. doi: 10.3969/j.issn.1006-2475.2016.07.019

• Network and Communication •

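 Analysis and Research of Malicious URL Recognition Based on SVM and TF-IDF

  1. Guangzhou City Construction College, Guangzhou 510925, Guangdong, China
  • Received: 2015-12-29 Online: 2016-07-21 Published: 2016-07-22
  • About the authors: GAN Hong (b. 1976), male, from Nanchang, Jiangxi; associate professor at Guangzhou City Construction College and doctoral candidate at Nanjing University; research interests: information security and applications of cloud computing technology. PAN Dan (b. 1980), female, from Guangzhou, Guangdong; lecturer, master's degree; research interests: big data technology and database applications.
  • Supported by:
     National Natural Science Foundation of China (61272067); Guangdong Natural Science Foundation Team Research Project (S2012030006242)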

Abstract: With the rapid development of the Internet, especially the mobile Internet, a growing number of fraudulent and destructive sites have appeared worldwide. By analyzing the text features and site features of URLs, this paper proposes a URL detection scheme based on machine learning. The site features of each URL are refined with the TF-IDF algorithm and, combined with the text features, are passed to an SVM with an RBF kernel for URL security detection. The scheme achieves 96% accuracy and an F1 score of 0.95.

Key words: network security, URL detection, TF-IDF, SVM
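
The building blocks named in the abstract (TF-IDF weighting, the RBF kernel, and the F1 score) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tokenization, the TF-IDF variant, the kernel parameter, or the data. The sample URLs, the `tokenize` rule, the choice of tf = count / length and idf = ln(N / df), and `gamma=1.0` are all assumptions made here for illustration.

```python
import math
import re
from collections import Counter


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed tokenization)."""
    return [t for t in re.split(r"[^0-9a-z]+", url.lower()) if t]


def tfidf_vectors(docs):
    """Return (vocab, vectors) with tf = count / length and idf = ln(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample URLs, used only to exercise the functions.
urls = [
    "http://example.com/login/index.html",
    "http://example.org/news/index.html",
    "http://example.net/free/prize/login.php",
]
vocab, vectors = tfidf_vectors(urls)
print(vocab)
print(rbf_kernel(vectors[0], vectors[1]))
```

Tokens that occur in every sample, such as `http`, receive an IDF of zero and therefore carry no weight, which is the sense in which TF-IDF refines a raw token-count feature. In a full classifier these vectors would be passed to an SVM trainer that uses the RBF kernel; the training step is omitted here because the abstract gives no detail about it.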