Research on Speech Recognition Robustness Based on Gabor Filtering

doi:10.3969/j.issn.1006-2475.2018.05.005

Computer and Modernization, 2018, Vol. 0, Issue (05): 20-.

(1. College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China;
2. Key Laboratory of Gansu Advanced Control for Industrial Processes, Lanzhou 730050, China;
3. National Experimental Teaching Demonstration Center of Electrical and Control Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

Received: 2017-10-21 Online: 2018-06-13 Published: 2018-06-13
About the authors: GOU Xin-ke (1966-), male, born in Tianshui, Gansu, Ph.D., professor at the College of Electrical and Information Engineering, Lanzhou University of Technology, the Key Laboratory of Gansu Advanced Control for Industrial Processes, and the National Experimental Teaching Demonstration Center of Electrical and Control Engineering, Lanzhou University of Technology; research interests: pattern recognition and signal processing. XU Gao-peng (1991-), male, born in Yulin, Shaanxi, master's student; research interest: speech recognition.

Abstract

Abstract: To improve the robustness of speech recognition systems, an acoustic feature extraction method based on GBFB (spectro-temporal Gabor filter bank) is proposed, and the high-dimensional GBFB features are reduced in dimension with a block PCA algorithm. The GBFB features are then compared with the commonly used GFCC, MFCC and LPCC features for noise robustness in several identical noise environments. The recognition rate of GBFB features is 5.35% higher than that of GFCC features and 7.05% higher than that of MFCC features, and the recognition baseline of GBFB features is 9 dB lower than that of LPCC features. The experimental results show that, in noisy environments, GBFB features are more robust than the traditional GFCC, MFCC and LPCC features.

Key words: speech recognition, robustness, Gabor filter, feature extraction, GBFB features
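
As a rough illustration of the block PCA step mentioned in the abstract, the sketch below splits a high-dimensional feature matrix into blocks of consecutive dimensions, runs PCA on each block separately, and concatenates the reduced blocks. The function name `block_pca`, the block size, the number of retained components, and the random stand-in data are all assumptions made for illustration; the paper's actual partitioning and parameters are not given on this page.

```python
import numpy as np


def block_pca(features, block_size, n_components):
    """Reduce each block of feature dimensions with its own PCA.

    features:     array of shape (n_frames, n_dims)
    block_size:   consecutive dimensions per block (hypothetical choice)
    n_components: principal components kept per block (hypothetical choice)
    """
    features = np.asarray(features, dtype=float)
    reduced = []
    for start in range(0, features.shape[1], block_size):
        block = features[:, start:start + block_size]
        centered = block - block.mean(axis=0)
        # Rows of vt are the principal directions of this block,
        # ordered by decreasing singular value.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        k = min(n_components, vt.shape[0])
        reduced.append(centered @ vt[:k].T)
    return np.hstack(reduced)


# Random data standing in for GBFB features: 200 frames, 300 dimensions.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 300))
compact = block_pca(frames, block_size=64, n_components=8)
print(compact.shape)  # (200, 40): five blocks, eight components each
```

Running PCA per block keeps each decomposition small, which is one common motivation for block-wise variants when the full feature vector is very high-dimensional.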

CLC number: TN912.3

缑新科1，2，3,徐高鹏1，2，3. 基于Gabor滤波的语音识别鲁棒性研究[J]. 计算机与现代化, 2018, 0(05): 20-.

GOU Xin-ke1，2，3， XU Gao-peng1，2，3. Research on Speech Recognition Robustness Based on Gabor Filtering[J]. Computer and Modernization, 2018, 0(05): 20-.


