Computer and Modernization ›› 2022, Vol. 0 ›› Issue (09): 119-126.

• Information Security •

DGA Domain Name Detection Combining Attention Mechanisms and Parallel Hybrid Network

  1. (School of Computers, Guangdong University of Technology, Guangzhou 510006, China)
  • Online:2022-09-22 Published:2022-09-22
  • About the authors: LIU Liting (born 1996), female, from Yangjiang, Guangdong, master's student; research interest: cyberspace security; E-mail: yuwei@qq.com. OU Yuyi (born 1974), female, from Hepu, Guangxi, associate professor, Ph.D.; research interests: computer network system integration and information security technology; E-mail: ouyuyi@gdut.edu.cn.
  • Supported by:
    Guangzhou Science and Technology Plan Project (201902020007, 202007010004)

Abstract: DGA domain name detection methods based on statistical features rely on complex feature engineering, while existing end-to-end deep learning methods perform poorly on the multi-class classification of DGA domain name families. To address these problems, a DGA domain name detection method combining attention mechanisms and a parallel hybrid network is proposed. First, a deep pyramid convolutional neural network (DPCNN) is introduced to extract deep semantic information from domain names, and it is improved with the channel attention block SENet to build DPCNN-SE, which adaptively learns inter-channel relationships and suppresses the propagation of useless features. In parallel, a self-attention mechanism is combined with a bidirectional long short-term memory network to build the BiLSTM-SA network, which captures the most representative global temporal features in domain name data. Finally, the features extracted by the two networks are fused and fed into a softmax layer to output the classification result. Experimental results show that, on the multi-class classification of domain name families, the method improves the F1-score by 10.30 and 10.18 percentage points over the single-model CNN and LSTM, respectively, and by 5.97 and 4.87 percentage points over the existing hybrid network methods Bilbo and BiGRU-MCNN, respectively, while also having lower computational complexity.

Key words: DGA domain name detection, feature fusion, end-to-end, long short-term memory neural network, convolutional neural network
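
As a rough illustration of two steps named in the abstract (SENet-style channel recalibration, and fusing the two branches before the softmax layer), the following pure-Python sketch may help. It is not the authors' implementation. The function names, the toy weights, the use of max pooling to turn the recalibrated feature map into a vector, and the choice of concatenation as the fusion operator are all assumptions made for illustration; the convolutional layers and the BiLSTM-SA branch are omitted and replaced by hand-written stand-in values.

```python
import math


def dense(vec, weights, bias):
    """Fully connected layer: one row of `weights` per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def se_recalibrate(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation over a list of channels.

    feature_map: list of channels, each a list of activations
    along the character sequence of a domain name.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid gates.
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]
    # Scale: each channel is multiplied by its gate in (0, 1).
    scaled = [[g * a for a in ch] for g, ch in zip(gates, feature_map)]
    return scaled, gates


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(cnn_features, rnn_features, w_out, b_out):
    """Assumed fusion: concatenate both branch vectors, then softmax."""
    fused = cnn_features + rnn_features
    return softmax(dense(fused, w_out, b_out))


# Toy input: 2 channels over 3 sequence positions (hypothetical values).
feature_map = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
w1, b1 = [[0.5, -1.0]], [0.0]          # 2 channels -> 1 hidden unit
w2, b2 = [[2.0], [-2.0]], [0.0, 0.0]   # 1 hidden unit -> 2 gates
scaled, gates = se_recalibrate(feature_map, w1, b1, w2, b2)

# Assumed pooling of the CNN branch, and a stand-in BiLSTM-SA vector.
cnn_features = [max(ch) for ch in scaled]
rnn_features = [0.2, -0.1]

# Hypothetical output layer for 3 domain-name families.
w_out = [[0.3, -0.2, 0.5, 0.1],
         [-0.4, 0.6, 0.0, 0.2],
         [0.1, 0.1, -0.3, 0.4]]
b_out = [0.0, 0.0, 0.0]
probs = fuse_and_classify(cnn_features, rnn_features, w_out, b_out)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, which is the "suppress less useful channels" behaviour the abstract attributes to the SE block; `probs` is a probability distribution over the three hypothetical families.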