Received:
2016-10-08
Published:
2017-06-23
Online:
2017-06-23
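About the authors:
MING Fang (1991-), female, born in Zigong, Sichuan, master's student at the School of Computer and Information Technology, Beijing Jiaotong University; research interests: natural language processing, machine translation. XU Jin-an (1970-), male, associate professor, Ph.D.; research interests: natural language processing, machine translation. WANG Nan (1992-), female, master's student; research interests: natural language processing, machine translation. CHEN Yu-feng (1981-), female, associate professor, Ph.D.; research interests: natural language processing, machine translation. ZHANG Yu-jie (1961-), female, professor, Ph.D.; research interests: natural language processing, machine translation.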
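Abstract: Statistical machine translation based on the hierarchical phrase-based model uses only limited context information, so its translation of tense is of low quality. To address this problem, we propose a Japanese-English statistical machine translation method that integrates tense features. By attaching tense classification constraints to translation rules, the method lets the decoder use each rule's latent tense class to match a sentence of a given tense with the most suitable rules. First, tense features are extracted from the bilingual training corpus to build a maximum entropy classification model. Next, tense features are extracted from the hierarchical phrase rules that carry tense information of each class. Finally, the tense classification result for each rule is added as a new feature to the hierarchical phrase-based translation system. Experimental results show that, compared with the baseline system, the method improves translation quality on several test sets and, to some extent, resolves the tense translation problem of the Japanese-English hierarchical phrase-based model.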
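The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tense label set, the feature names (such as `suffix=ta`), the weights, and the function names `maxent_probs` and `rule_score` are all hypothetical, chosen only to show how a maximum entropy tense classifier could supply one extra feature to a log-linear rule score.

```python
import math

# Hypothetical tense label set; the abstract does not list the actual classes.
TENSES = ("past", "present", "future")


def maxent_probs(weights, features):
    """Maximum entropy (softmax) distribution over tenses for one rule.

    weights maps each tense to a dict of feature weights; features is the
    list of (hypothetical) tense features observed on the rule.
    """
    scores = {t: sum(weights[t].get(f, 0.0) for f in features) for t in TENSES}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


def rule_score(base_features, lambdas, rule_tense_probs, sentence_tense, lambda_tense):
    """Log-linear rule score with the tense feature added as one more term."""
    base = sum(lambdas[name] * value for name, value in base_features.items())
    return base + lambda_tense * math.log(rule_tense_probs[sentence_tense])


# Toy weights (assumed values, not learned from any corpus).
weights = {
    "past": {"suffix=ta": 2.0},
    "present": {"suffix=ru": 2.0},
    "future": {},
}
probs = maxent_probs(weights, ["suffix=ta"])
```

With these toy weights, a rule carrying the `suffix=ta` feature receives a higher score when the input sentence is classified as past than when it is classified as present, which is the matching behaviour the abstract describes. In a real system the weight of the tense feature would be tuned together with the other feature weights.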
MING Fang, XU Jin-an, WANG Nan, CHEN Yu-feng, ZHANG Yu-jie. A Japanese-English Hierarchical Phrase-based Translation Model Integrating Tense Features[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2017.06.001.