Computer and Modernization ›› 2025, Vol. 0 ›› Issue (12): 61-65. doi: 10.3969/j.issn.1006-2475.2025.12.009

• Algorithm Design and Analysis •

Contrastive Learning Method for Detecting Malicious Encrypted Traffic

  1. (School of Computer, Guangdong University of Technology, Guangzhou 510006, China)
  • Online: 2025-12-18 Published: 2025-12-18
  • About the author: Wu Jiahong (吴佳宏; born 2000), male, from Chaozhou, Guangdong; master's student; research interest: malicious traffic detection. E-mail: gzwusir627@163.com
  • Funding:
    Guangzhou Key Area Research and Development Program (202007010004)

Abstract: To address the insufficient representation capability of malicious encrypted traffic detection models, this paper proposes a detection method based on contrastive learning that strengthens the model's representations and thereby improves the detection accuracy for malicious encrypted traffic. Unlike traditional approaches that extract features directly from traffic data, the method first learns an intrinsic representation of the data and then performs feature extraction. Specifically, a multi-scale mechanism first extracts local and global features of encrypted traffic to capture key information at different scales. Then, in the metric space of contrastive learning, the objective function is optimized to reduce the distance between each encrypted traffic sample and its correct classification label while increasing its distance from the incorrect label, so that the model separates malicious from normal encrypted traffic more effectively. After training, the model captures more discriminative features of encrypted traffic, which ultimately improves detection accuracy. The experimental dataset is sampled and combined from several public datasets, including UNSW NS 2019, CICIDS-2017, CIC-AndMal 2017, the Malware Capture Facility Project Dataset, and CICIDS-2012. Experimental results show that the method outperforms the comparison models, reaching a detection accuracy of 97.59%, which is 3.16 percentage points higher than the random forest baseline. The method also improves interpretability and detection speed.
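The abstract does not give the objective function itself. As a rough illustration of the idea of pulling a traffic embedding toward its correct label and away from the incorrect one, the following is a minimal sketch that assumes a softmax-over-cosine-similarity loss (an InfoNCE-style form chosen here for illustration, not confirmed by the paper). The function names, the two-dimensional label vectors, and the `temperature` value are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon guards against zero vectors


def label_contrastive_loss(embedding, label_vectors, true_label, temperature=0.1):
    """Negative log-softmax of the similarity to the correct label vector.

    Assumed loss form: minimizing it raises similarity to the correct label
    and lowers similarity to every other label.
    """
    logits = [cosine_similarity(embedding, lv) / temperature for lv in label_vectors]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[true_label]


def predict(embedding, label_vectors):
    """Assign the label whose vector is most similar to the embedding."""
    sims = [cosine_similarity(embedding, lv) for lv in label_vectors]
    return max(range(len(sims)), key=sims.__getitem__)


# Hypothetical label vectors: index 0 = normal, index 1 = malicious.
label_vectors = [[1.0, 0.0], [0.0, 1.0]]
flow_embedding = [0.9, 0.2]  # made-up embedding of one encrypted flow

loss_if_normal = label_contrastive_loss(flow_embedding, label_vectors, true_label=0)
loss_if_malicious = label_contrastive_loss(flow_embedding, label_vectors, true_label=1)
print(loss_if_normal < loss_if_malicious)
print(predict(flow_embedding, label_vectors))
```

In this toy setup the embedding lies close to the "normal" label vector, so the loss is small when the true label is normal and large when it is malicious. A training step would adjust the encoder to shrink that loss. The multi-scale feature extractor that would produce the embedding is not sketched here.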

Key words: encrypted traffic, malicious traffic, deep learning, contrastive learning, multi-scale features

CLC Number: