Computer and Modernization ›› 2023, Vol. 0 ›› Issue (05): 52-57.

• Information Security •

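Improved End-to-end Synthetic Speech Detection Method Based on Auxiliary Learning

  1. (College of Energy and Electrical Engineering, Hohai University, Nanjing 211100, China)
  • Online: 2023-06-06  Published: 2023-06-06
  • About the authors: Yuan Tiantian (袁甜甜, b. 1998), female, from Huainan, Anhui, master's student; research interests: machine learning and audio processing; E-mail: yuantaurora@163.com. Corresponding author: Li Zhihua (李志华, b. 1964), male, from Taizhou, Jiangsu, professor and master's supervisor; research interests: artificial intelligence and fault diagnosis of complex systems; E-mail: zhli@hhu.edu.cn. Qiu Yang (邱阳, b. 1997), male, from Danyang, Jiangsu, master's student; research interests: machine learning and image recognition.
  • Supported by:
    Natural Science Foundation of Jiangsu Province (BK20151500)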

Abstract: With the development of deepfake technology, synthetic speech detection faces growing challenges. This paper proposes a synthetic speech detection method that integrates auxiliary learning into an end-to-end model. After data alignment, the audio is fed directly into the improved end-to-end model without extracting any hand-crafted features. The main task is binary classification of genuine versus synthetic speech. Discrimination among different synthetic speech types is chosen as the auxiliary task, which provides a prior assumption for the main detection task, and the weighting used to combine the main and auxiliary tasks is optimized. Experiments on the public datasets ASVspoof2019 and ASVspoof2015 show that the improved model effectively reduces the equal error rate (EER) compared with models that use hand-crafted features, outperforms the original end-to-end model, and generalizes better to unknown attack types.

Key words: deepfake, synthetic speech detection, auxiliary learning, weight optimization, end-to-end system
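
The abstract describes a main task (genuine vs. synthetic) trained together with an auxiliary task (synthetic speech type), combined through task weights. A minimal sketch of such a weighted joint loss follows. It is illustrative only: the abstract does not state the paper's actual loss or weight optimization scheme, so the fixed scalar `aux_weight`, the function names, and the label encoding here are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one sample given raw logits and an integer class label."""
    return -math.log(softmax(logits)[target])


def joint_loss(main_logits, aux_logits, main_label, aux_label, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss.

    main_label: 0 = genuine, 1 = synthetic (assumed encoding).
    aux_label:  index of the synthetic speech type (assumed encoding).
    aux_weight: hypothetical fixed weight; the paper optimizes the weighting,
                but the abstract does not say how.
    """
    main = cross_entropy(main_logits, main_label)
    aux = cross_entropy(aux_logits, aux_label)
    return main + aux_weight * aux


# Example: one synthetic utterance whose (assumed) attack type is class 1.
loss = joint_loss([0.2, 1.5], [0.1, 2.0, -0.3], main_label=1, aux_label=1)
```

Setting `aux_weight` to 0 reduces this to the plain binary detection loss, which is one way to compare against a single-task end-to-end baseline.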