Computer and Modernization, 2021, Vol. 0, Issue (02): 73-77.

• Artificial Intelligence •

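Method of Nonparallel Speech Denoising Based on CycleGAN

  1. (College of Energy and Electrical Engineering, Hohai University, Nanjing 211100, Jiangsu, China)
  • Published: 2021-03-01 Online: 2021-03-01
  • About the authors: Han Cancan (韩灿灿, born 1995), female, from Anqing, Anhui; master's student; research interests: machine learning and audio processing; E-mail: emma_han11@foxmail.com. Li Zhihua (李志华, born 1964), male, from Taizhou, Jiangsu; professor, master's supervisor, Ph.D.; research interests: artificial intelligence and fault diagnosis of complex systems. Xu Rui (徐睿, born 1995), female, from Yancheng, Jiangsu; master's student; research interests: machine learning and audio processing.
  • Funding:
    Natural Science Foundation of Jiangsu Province (BK20151500)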

Abstract: To address speech denoising, this paper proposes a method based on the cycle-consistent generative adversarial network (CycleGAN) for denoising speech in acoustic scenes. The method combines the CycleGAN network model with cross-domain voice conversion techniques and optimizes the combination: it extracts spectral envelope features from the speech and then encodes and decodes them, aiming to achieve end-to-end speech denoising with generative modeling. This simplifies the high-order difference problem that arises during speech denoising and broadens the range of applicable scenarios. The model is trained and tested on both nonparallel and parallel datasets, and its denoising performance is compared mainly against a conventional CycleGAN-based speech denoising method. The experimental results show relative improvements of 8.49%, 6.53%, and 23.30% in PESQ, NR, and SSNR, respectively, indicating that the method effectively handles nonparallel speech denoising in practical scenarios.

Key words: speech denoising, CycleGAN, voice conversion, nonparallel data set
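
The abstract reports relative improvements in metrics such as SSNR (segmental signal-to-noise ratio). As a rough illustration of what such figures measure, the sketch below computes SSNR in its commonly used form (the mean of per-frame SNR values in dB, each clamped to a fixed range) and a relative improvement in percent. The frame length, the clamping range of -10 dB to 35 dB, and the function names are assumptions made for this example; the abstract does not state the exact evaluation settings, and the signals used here are synthetic, not data from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean reference and an enhanced signal.

    frame_len, min_db and max_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    eps = 1e-10  # guards against log(0) and division by zero
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(s * s for s in ref)
        noise_energy = sum((s - e) ** 2 for s, e in zip(ref, est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        # Clamp so silent or perfectly matched frames do not dominate the mean.
        frame_snrs.append(min(max(snr_db, min_db), max_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


def relative_improvement(baseline, proposed):
    """Relative change of `proposed` over `baseline`, in percent."""
    return (proposed - baseline) / abs(baseline) * 100.0


if __name__ == "__main__":
    # Synthetic example: a sine wave and a copy with a 10% amplitude error.
    clean = [math.sin(0.1 * n) for n in range(1024)]
    enhanced = [0.9 * s for s in clean]
    print(round(segmental_snr(clean, enhanced), 3))
    print(relative_improvement(2.0, 2.5))
```

Under this definition, a figure such as the reported 23.30% for SSNR would correspond to `relative_improvement(baseline_ssnr, proposed_ssnr)`, where the baseline is the conventional CycleGAN denoiser. PESQ is defined by a standardized perceptual model and is not reproduced here.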