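Cochannel Speech Separation Based on Clustering

Computer and Modernization, 2014, Vol. 0, Issue (4): 86-88.

Computer and Electronic Information College, Guangxi University, Nanning 530004, Guangxi, China

Received: 2014-01-20  Online: 2014-04-17  Published: 2014-04-23
About the author: WU Chun (b. 1988), male, from Hezhou, Guangxi, is a master's student at the Computer and Electronic Information College, Guangxi University. His research interest is computer software and theory.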

Abstract

Abstract: Many computational auditory scene analysis (CASA) systems built on trained models depend on the quality of their training samples and on prior knowledge of the speakers when separating two-talker (cochannel) speech mixtures. To remove this dependence, this paper proposes an unsupervised, clustering-based system for monaural cochannel speech separation. The system first analyzes the mixture with a multipitch tracking algorithm to produce simultaneous streams. It then searches for the best assignment of those streams to the two speakers by maximizing the trace of the ratio of the between-cluster scatter matrix to the within-cluster scatter matrix, which completes the separation. The system requires no trained speaker models and effectively improves the separation of two-speaker mixtures, offering a new approach to cochannel speech separation.
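The stream-assignment step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each simultaneous stream has already been summarized by a fixed-length feature vector (the abstract does not say which features are used), reads the "scatter matrix ratio" as tr(S_W^{-1} S_B), searches all two-way assignments exhaustively, and adds a small ridge term so that S_W stays invertible. The names `scatter_trace`, `best_assignment`, and `reg` are hypothetical.

```python
import itertools

import numpy as np


def scatter_trace(features, labels, reg=1e-6):
    """Return tr(S_W^{-1} S_B) for a two-cluster labeling of the rows of `features`."""
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    s_w = np.zeros((dim, dim))  # within-cluster scatter
    s_b = np.zeros((dim, dim))  # between-cluster scatter
    for k in (0, 1):
        members = features[labels == k]
        if len(members) == 0:
            return -np.inf  # each speaker must receive at least one stream
        mean_k = members.mean(axis=0)
        centered = members - mean_k
        s_w += centered.T @ centered
        diff = (mean_k - overall_mean)[:, None]
        s_b += len(members) * (diff @ diff.T)
    s_w += reg * np.eye(dim)  # assumption: small ridge keeps S_W invertible
    return float(np.trace(np.linalg.solve(s_w, s_b)))


def best_assignment(features):
    """Exhaustively search the two-speaker labeling that maximizes the criterion."""
    n = features.shape[0]
    best_labels, best_score = None, -np.inf
    # Fix stream 0 to speaker 0: swapping the two labels gives the same partition.
    for rest in itertools.product((0, 1), repeat=n - 1):
        labels = np.array((0,) + rest)
        score = scatter_trace(features, labels)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels, best_score
```

Calling `best_assignment(features)` on an array with one row per simultaneous stream returns a 0/1 label per stream together with the criterion value. The exhaustive search grows as 2^(n-1) in the number of streams, so it is only practical for short utterances; a longer mixture would need a heuristic search, which the abstract does not specify.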



Key words: CASA; cochannel speech separation; clustering

CLC number: TP391

WU Chun, LIANG Zhengyou. Cochannel Speech Separation Based on Clustering[J]. Computer and Modernization, 2014, 0(4): 86-88.
