Computer and Modernization (计算机与现代化)

• Artificial Intelligence •

Machine Translation System Based on Self-Attention Model

  

  1. (College of Computer and Information, Hohai University, Nanjing 211100, China)
  • Received: 2019-01-11  Online: 2019-07-05  Published: 2019-07-08
  • About the authors: SHI Yan (师岩) (1993-), male, born in Julu, Hebei, master's student, research interest: natural language processing, E-mail: yansirsy@qq.com; WANG Yu (王宇) (1979-), male, Ph.D., research fellow, research interest: cloud computing technology, E-mail: won9805@hhu.edu.cn; WU Shuiqing (吴水清) (1994-), female, master's student, research interest: object detection and recognition, E-mail: wsq30332@163.com.
  • Supported by:
    Young Scientists Fund of the National Natural Science Foundation of China (61103017); Sub-project of the Chinese Academy of Sciences Strategic Priority Research Program on Sensing China (XDA06040504)



Abstract: In recent years, neural machine translation (NMT) has developed rapidly. The Seq2Seq framework brought significant advantages to machine translation, because it can generate an arbitrary output sequence after observing the entire input sentence. However, this model is still very limited in its ability to capture long-distance information; recurrent neural networks (RNN) and LSTM networks were proposed to alleviate this problem, but the improvement is not obvious. The introduction of the attention mechanism effectively compensates for this deficiency, and the Self-Attention model was proposed on the basis of the attention mechanism. This paper builds an encoder-decoder framework with Self-Attention as its foundation. By reviewing previous neural network translation models and analyzing the mechanism and principle of the Self-Attention model, this paper implements a Self-Attention-based translation system with the TensorFlow deep learning framework. In an English-to-Chinese translation experiment, the system is compared with previous neural network translation models, and the results show that it achieves better translation performance.

Key words: neural machine translation, Seq2Seq, attention mechanism, Self-Attention model
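
As a rough illustration of the mechanism described in the abstract, the sketch below implements scaled dot-product self-attention, the core operation of a Self-Attention encoder-decoder, in TensorFlow 2.x. It is a minimal example rather than the paper's actual implementation; the names scaled_dot_product_attention, SelfAttentionLayer, and d_model are illustrative assumptions.

```python
# Minimal sketch of scaled dot-product self-attention (TensorFlow 2.x).
# Illustrative only; not the implementation described in the paper.
import tensorflow as tf


def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(QK^T / sqrt(d_k)) and apply the weights to V."""
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(d_k)
    if mask is not None:
        scores += mask * -1e9  # suppress padded / future positions
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, v), weights


class SelfAttentionLayer(tf.keras.layers.Layer):
    """Single-head self-attention: Q, K, V are projections of the same input."""

    def __init__(self, d_model):
        super().__init__()
        self.wq = tf.keras.layers.Dense(d_model)
        self.wk = tf.keras.layers.Dense(d_model)
        self.wv = tf.keras.layers.Dense(d_model)

    def call(self, x, mask=None):
        q, k, v = self.wq(x), self.wk(x), self.wv(x)
        output, _ = scaled_dot_product_attention(q, k, v, mask)
        return output


# Usage: a batch of 2 sentences, 5 tokens each, 64-dimensional embeddings.
x = tf.random.normal((2, 5, 64))
layer = SelfAttentionLayer(d_model=64)
print(layer(x).shape)  # (2, 5, 64)
```

In an encoder-decoder built on this operation, stacked self-attention layers replace the recurrence of RNN/LSTM encoders, which is what allows each position to attend directly to every other position regardless of distance.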

CLC number: