Computer and Modernization

• Software Engineering

Cross-language Clone Detection Based on Revision Similarity

  

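  1. (1. School of Software, Shanghai Jiao Tong University, Shanghai 200240, China; 2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)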

  • Received: 2015-11-06 Online: 2016-04-14 Published: 2018-09-30
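  • About the authors: Liu Mengyu (柳萌宇, born 1991), male, from Huanggang, Hubei, master's student at the School of Software, Shanghai Jiao Tong University; research interests: program analysis and verification, clone detection, and program synchronization. Zhong Hao (钟浩), male, associate professor at the Department of Computer Science and Engineering, Shanghai Jiao Tong University. Yu Haibo (于海波), female, lecturer.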
  • Funding:

    National Natural Science Foundation of China (61572313)



Abstract:

To attract more developers or to support different platforms, open-source organizations and commercial companies tend to implement their projects in multiple programming languages. Such multi-language projects contain a large number of cross-language code clones, so cross-language clone detection becomes an important part of maintaining them. However, most existing tools can only detect clones within a single language and cannot effectively detect cross-language clones such as those between Java and C#. In this paper, we propose a clone detection approach based on the similarity of code revisions, implemented in a tool that detects cross-language clones between Java and C# code. We evaluate the tool on two open-source projects, ANTLR and FpML. The experiments show that it can effectively identify cross-language code clones.

Key words: clone detection, information retrieval, data mining, revision similarity

CLC number: