[1] 张家俊,宗成庆. 神经网络语言模型在统计机器翻译中的应用[J]. 情报工程, 2017,3(3):21-28.
[2] 刘洋. 神经机器翻译前沿进展[J]. 计算机研究与发展, 2017,54(6):1144-1149.
[3] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]// Advances in Neural Information Processing Systems 27 (NIPS 2014). 2014:3104-3112.
[4] BAHDANAU D, CHO K H, BENGIO Y. Neural Machine Translation by Jointly Learning to Align and Translate[J/OL]. (2014-12-19)[2018-12-10]. https://arxiv.org/pdf/1409.0473v4.pdf.
[5] KALCHBRENNER N, BLUNSOM P. Recurrent continuous translation models[C]// Proceedings of the 2013 ACL Conference on Empirical Methods in Natural Language Processing (EMNLP). 2013:1700-1709.
[6] CHO K H, VAN MERRIENBOER B, BAHDANAU D, et al. On the properties of neural machine translation: Encoder-Decoder approaches[C]// Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation. 2014:103-111.
[7] DYER C, KUNCORO A, BALLESTEROS M, et al. Recurrent neural network grammars[C]// Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016:199-209.
[8] CHUNG J Y, GULCEHRE , CHO K H, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling[J/OL]. (2014-12-11)[2018-12-10]. https://arxiv.org/pdf/1412.3555.pdf.
[9] GULCEHRE , FIRAT O, XU K, etal. On Using Monolingual Corpora in Neural Machine Translation[J/OL]. (2015-06-12)[2018-12-10]. https://arxiv.org/pdf/1503.03535.pdf.
[10]WU Y H, SCHUSTER M, CHEN Z F, et al. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation[J/OL]. (2016-09-26)[2018-12-10]. https://arxiv.org/pdf/1609.08144v1.pdf.
[11]PASCANU A, MIKOLOV T, BENGIO Y. On the Difficulty of Training Recurrent Neural Networks[J/OL]. (2013-02-16)[2018-12-10]. https://arxiv.org/pdf/1211.5063.pdf.
[12]HOCHREITER S, BENGIO Y, FRASCONI P, et al. Gradient flow in recurrent nets: The difficulty of learning long-term dependencies[M]// A Field Guide to Dynamical Recurrent Neural Networks. Wiley, 2001:237-243.
[13]〖JP+2〗HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997,9(8):1735-1780.
[14]HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778.
[15]KIM Y, DENTON C, HOANG L, et al. Structured Attention Networks[J/OL]. (2017-02-16)[2018-12-10]. https://arxiv.org/pdf/1702.00887.pdf.
[16]LUONG M T, PHAM H, MANNING C D. Effective Approaches to Attention-based Neural Machine Translation[J/OL]. (2015-09-20)[2018-12-10]. https://arxiv.org/pdf/1508.04025.pdf.
[17]VASWANI A, SHAZEER N, PARMAR N, et al. Attention Is All You Need[J/OL]. (2017-06-30)[2018-12-10]. https://arxiv.org/pdf/1706.03762v4.pdf.
[18]BRITZ D, GOLDIE A, LUONG M T, et al. Massive Exploration of Neural Machine Translation Architectures[J/OL]. (2017-03-21)[2018-12-10]. https://arxiv.org/pdf/1703.03906.pdf.
[19]CHO K H, VAN MERRIENBOER B, GULCEHRE , et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation[J/OL]. (2014-09-03)[2018-12-10]. https://arxiv.org/pdf/1406.1078.pdf.
[20]KAISER L, BENGIO S. Can Active Memory Replace Attention?[J/OL]. (2016-10-27)[2018-12-10]. https://arxiv.org/pdf/1610.08613v1.pdf.
[21]BA J L, KIROS J R, HINTON G E. Layer Normalization[J/OL]. (2016-07-21)[2018-12-10]. https://arxiv.org/pdf/1607.06450.pdf.
[22]GEHRING J, AULI M, GRANGIER D, et al. Convolutional Sequence to Sequence Learning[J/OL]. (2017-05-12)[2018-12-10]. https://arxiv.org/pdf/1705.03122v2.pdf.
[23]PAPINENI K, ROUKOS S, WARD T, et al. BLEU: A method for automatic evaluation of machine translation[C]// Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002,7:311-318. |