[1] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv preprint arXiv:1406.1078, 2014.
[2] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014:3104-3112.
[3] CHO K, VAN MERRIENBOER B, BAHDANAU D, et al. On the properties of neural machine translation: Encoder-decoder approaches[C]// Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation. 2014:103-111.
[4] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[J]. arXiv preprint arXiv:1301.3781, 2013.
[5] MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[J]. arXiv preprint arXiv:1310.4546, 2013.
[6] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[J]. arXiv preprint arXiv:1802.05365, 2018.
[7] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. Technical Report, 2018.
[8] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[9] KOEHN P, KNOWLES R. Six challenges for neural machine translation[C]// Proceedings of the 1st Workshop on Neural Machine Translation. 2017:28-39.
[10] BROWN P F, COCKE J, DELLA PIETRA S A, et al. A statistical approach to machine translation[J]. Computational Linguistics, 1990, 16(2):79-85.
[11] SENNRICH R, HADDOW B, BIRCH A. Improving neural machine translation models with monolingual data[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016:86-96.
[12] CURREY A, MICELI BARONE A V, HEAFIELD K. Copied monolingual data improves low-resource neural machine translation[C]// Proceedings of the 2nd Conference on Machine Translation. 2017:148-156.
[13] FADAEE M, BISAZZA A, MONZ C. Data augmentation for low-resource neural machine translation[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017:567-573.
[14] NGUYEN X P, JOTY S, WU K, et al. Data diversification: A simple strategy for neural machine translation[J]. arXiv preprint arXiv:1911.01986, 2019.
[15] ZOPH B, YURET D, MAY J, et al. Transfer learning for low-resource neural machine translation[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016:1568-1575.
[16] NIU X, DENKOWSKI M, CARPUAT M. Bi-directional neural machine translation with synthetic parallel data[C]// Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. 2018:84-91.
[17] BAZIOTIS C, HADDOW B, BIRCH A. Language model prior for low-resource neural machine translation[J]. arXiv preprint arXiv:2004.14928, 2020.
[18] GULCEHRE C, FIRAT O, XU K, et al. On using monolingual corpora in neural machine translation[J]. arXiv preprint arXiv:1503.03535, 2015.
[19] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2016.
[20] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780.
[21] BOJAR O, CHATTERJEE R, FEDERMANN C, et al. Findings of the 2016 conference on machine translation[C]// Proceedings of the 1st Conference on Machine Translation. 2016:131-198.
[22] SENNRICH R, HADDOW B, BIRCH A. Neural machine translation of rare words with subword units[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016:1715-1725.
[23] KINGMA D P, BA J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[24] BOULANGER-LEWANDOWSKI N, BENGIO Y, VINCENT P. Audio chord recognition with recurrent neural networks[C]// Proceedings of the 14th International Society for Music Information Retrieval Conference. 2013:335-340.
[25] GRAVES A. Sequence transduction with recurrent neural networks[J]. arXiv preprint arXiv:1211.3711, 2012.
[26] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: A method for automatic evaluation of machine translation[C]// Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002:311-318.
[27] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st Conference on Neural Information Processing Systems. 2017:5998-6008.