Computer and Modernization ›› 2022, Vol. 0 ›› Issue (03): 13-17.

End to End Voiceprint Recognition Based on Nonlinear Stacked Bidirectional Network

  

  (1. School of Electronic Information, Xi’an Polytechnic University, Xi’an 710699, China;
   2. School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China)
  Online: 2022-04-29    Published: 2022-04-29

Abstract: Traditional voiceprint recognition methods are cumbersome and have low recognition rates, while the neural networks used in existing deep learning methods are not tailored to speech signals, which limits their recognition accuracy. To address these problems, this paper proposes an end-to-end voiceprint recognition method based on a nonlinear stacked bidirectional long short-term memory (LSTM) network. First, Fbank features are extracted from the original voice files and used as the input to the network model. Then, because speech signals are continuous and strongly correlated over time, a bidirectional LSTM network is constructed to process the voice data and extract deep features. To further enhance the nonlinear expressive ability of the network, multiple bidirectional LSTM layers and multiple nonlinear layers are stacked to extract deeper abstract features of the speech signal. Finally, the SGD optimizer is used to train the model. The experimental results show that the proposed method makes full use of the sequential characteristics of speech signals and has a strong ability to model temporal context and to express nonlinear relationships. The constructed model is tightly integrated and achieves better recognition performance than the GRU and LSTM models.
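
The pipeline described in the abstract (Fbank frames, stacked bidirectional LSTM layers, stacked nonlinear layers, SGD training) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name StackedBiLSTMSpeakerNet, the layer sizes, the ReLU activations, the mean pooling over time, the cross-entropy loss, the learning rate, and the random stand-in data are all assumptions chosen only for illustration, since the abstract does not specify them.

```python
import torch
import torch.nn as nn


class StackedBiLSTMSpeakerNet(nn.Module):
    """Hypothetical sketch: stacked BiLSTM layers followed by nonlinear layers."""

    def __init__(self, n_mels=40, hidden=128, lstm_layers=3,
                 embed_dim=256, n_speakers=10):
        super().__init__()
        # Stacked bidirectional LSTM over the sequence of Fbank frames.
        self.bilstm = nn.LSTM(
            input_size=n_mels,
            hidden_size=hidden,
            num_layers=lstm_layers,
            batch_first=True,
            bidirectional=True,
        )
        # Stacked nonlinear layers (fully connected + ReLU, an assumption).
        self.nonlinear = nn.Sequential(
            nn.Linear(2 * hidden, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
            nn.ReLU(),
        )
        # Speaker classification head.
        self.classifier = nn.Linear(embed_dim, n_speakers)

    def forward(self, fbank):
        # fbank: (batch, frames, n_mels)
        out, _ = self.bilstm(fbank)   # (batch, frames, 2 * hidden)
        pooled = out.mean(dim=1)      # average over time (an assumption)
        return self.classifier(self.nonlinear(pooled))  # (batch, n_speakers)


torch.manual_seed(0)
model = StackedBiLSTMSpeakerNet()
# SGD optimizer as named in the abstract; lr and momentum are assumed values.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Random stand-in for a batch of 8 utterances, 100 frames, 40 Fbank bins each.
features = torch.randn(8, 100, 40)
labels = torch.randint(0, 10, (8,))

# One training step: forward pass, loss, backward pass, parameter update.
logits = model(features)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In this sketch the forward and backward hidden states of the last LSTM layer are concatenated at every frame, which is why the first linear layer takes 2 * hidden inputs. Real Fbank features would replace the random tensor.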

Key words: voiceprint recognition, end to end, sequential characteristic, long short-term memory, stacked network, nonlinear