Speech Tracking Based on Cluster Analysis and Speaker Recognition

doi:10.3969/j.issn.1006-2475.2020.04.002

Abstract

Speech tracking quality degrades seriously under speaker interference, that is, when a speech segment contains mixed signals from multiple speakers. To address this, a speech tracking algorithm based on cluster analysis and speaker recognition is proposed. First, an improved cluster analysis method is used for speech separation: cluster centroids are cached and the sampling rate is lowered in K-means clustering, and regularization terms are introduced into the embedding feature space. Second, a GMM-UBM speaker model is used for speech tracking. Experimental results show that the improved cluster analysis method effectively improves both the real-time performance of the algorithm and the quality of speech separation, and that the GMM-UBM model achieves an 84% recognition rate on 3 s speech tests.
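The abstract does not spell out how the centroid caching and the reduced sampling rate are implemented, so the following is only a minimal sketch of one plausible reading. It assumes that the centroids found for one chunk of embedding vectors are reused as the starting point for the next chunk, and that K-means is fitted on every `stride`-th embedding before all points are labeled. The function names (`kmeans`, `assign`, `cluster_chunk`), the `stride` parameter, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans(points, k, init=None, n_iter=20, seed=0):
    """Plain Lloyd's K-means. `init` warm-starts from cached centroids (assumption)."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Cold start: pick k distinct points as initial centroids.
        centroids = points[rng.choice(len(points), size=k, replace=False)]
    else:
        centroids = np.array(init, dtype=float)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def cluster_chunk(embeddings, k, cached=None, stride=4):
    """Fit on a subsampled chunk, warm-started from `cached`, then label every point."""
    centroids = kmeans(embeddings[::stride], k, init=cached)
    return assign(embeddings, centroids), centroids


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cache = None
    for _ in range(3):
        # Synthetic stand-in for the embeddings of a two-speaker mixture.
        chunk = np.vstack([rng.normal(-3, 0.5, (200, 8)),
                           rng.normal(3, 0.5, (200, 8))])
        labels, cache = cluster_chunk(chunk, k=2, cached=cache)
        print(np.bincount(labels))
```

Under these assumptions, warm-starting tends to cut the number of iterations per chunk and keeps cluster indices consistent from one chunk to the next, which is convenient when the clusters are later passed to a speaker model for tracking.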

Key words: single-channel speech tracking, intelligent speech, cluster analysis, Gaussian mixture model, LSTM

CLC Number: TP391.42

HAO Min, LIU Hang, LI Yang, JIAN Dan, WANG Jun-ying. Speech Tracking Based on Cluster Analysis and Speaker Recognition[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2020.04.002.

