[1] SARKAR A K, TAN Z H. Text dependent speaker verification using un-supervised HMM-UBM and temporal GMM-UBM[C]// Interspeech 2016. 2016:425-429.
[2] HAO M, LIU H, LI Y, et al. Speech tracking based on cluster analysis and speaker recognition[J]. Computer and Modernization, 2020(4):11-17.
[3] VARIANI E, LEI X, MCDERMOTT E, et al. Deep neural networks for small footprint text-dependent speaker verification[C]// 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. 2014:4052-4056.
[4] TORFI A, DAWSON J, NASRABADI N M. Text-independent speaker verification using 3D convolutional neural networks[C]// 2018 IEEE International Conference on Multimedia and Expo. 2018:1-6.
[5] XIANG Y, YIN F, YUAN P. Voiceprint clustering system based on X-Vector embedding and BLSOM model[J]. Modern Computer, 2020,618(9):4-8.
[6] ROHDIN J, SILNOVA A, DIEZ M, et al. End-to-end DNN based speaker recognition inspired by I-vector and PLDA[C]// 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. 2018:4874-4878.
[7] HEIGOLD G, MORENO I, BENGIO S, et al. End-to-end text-dependent speaker verification[C]// 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. 2016:5115-5119.
[8] LI C, MA X K, JIANG B, et al. Deep speaker: An end-to-end neural speaker embedding system[J]. arXiv preprint arXiv:1705.02304, 2017.
[9] SHON S, TANG H, GLASS J. Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model[C]// 2018 IEEE Spoken Language Technology Workshop. 2018:1007-1013.
[10] SHIN H C, ROTH H R, GAO M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning[J]. IEEE Transactions on Medical Imaging, 2016,35(5):1285-1298.
[11] WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019,38(5):1-12.
[12] LUO Y, CHEN Z, YOSHIOKA T. Dual-path RNN: Efficient long sequence modeling for time-domain single-channel speech separation[C]// 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. 2020:46-50.
[13] BASIRI M E, NEMATI S, ABDAR M, et al. ABCDM: An attention-based bidirectional CNN-RNN deep model for sentiment analysis[J]. Future Generation Computer Systems, 2021,115:279-294.
[14] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997,9(8):1735-1780.
[15] YU Y, SI X S, HU C H, et al. A review of recurrent neural networks: LSTM cells and network architectures[J]. Neural Computation, 2019,31(7):1235-1270.
[16] EL-MONEIM S A, NASSAR M A, DESSOUKY M I, et al. Text-independent speaker recognition using LSTM-RNN and speech enhancement[J]. Multimedia Tools and Applications, 2020,79(2):24013-24028.
[17] GRAVES A, SCHMIDHUBER J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005,18(5-6):602-610.
[18] YAN S Y, SMITH J S, LU W J, et al. CHAM: Action recognition using convolutional hierarchical attention model[C]// Proceedings of 2017 IEEE International Conference on Image Processing. 2017:3958-3962.
[19] LI H W, WU Q X. Implementation schemes of neural network activation functions in intelligent sensors[J]. Transducer and Microsystem Technologies, 2014,33(1):51-53.
[20] BALDI P, SADOWSKI P, LU Z. Learning in the machine: Random backpropagation and the deep learning channel[J]. Artificial Intelligence, 2018,260:1-35.
[21] CHATTERJEE A, GUPTA U, CHINNAKOTLA M K, et al. Understanding emotions in text using deep learning and big data[J]. Computers in Human Behavior, 2019,93:309-317.
[22] NAIR V, HINTON G E. Rectified linear units improve restricted Boltzmann machines[C]// Proceedings of the 27th International Conference on Machine Learning. 2010:807-814.
[23] DOLEZEL P, SKRABANEK P, GAGO L. Weight initialization possibilities for feedforward neural network with linear saturated activation functions[J]. IFAC-PapersOnLine, 2016,49(25):49-54.
[24] MAAS A L, HANNUN A Y, NG A Y. Rectifier nonlinearities improve neural network acoustic models[C]// Proceedings of the 30th International Conference on Machine Learning. 2013:456-462.