Computer and Modernization


Learning Semantically Related Words in Software Through Word Embedding

  

  1. School of Software, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2016-12-01 Online: 2017-09-20 Published: 2017-09-19

Abstract: Searching for previously written code is important in software development and maintenance. As in traditional information retrieval, the inherent difficulty of keyword-based code search is the vocabulary mismatch between the user's query and the code to be retrieved. Improving code search accuracy therefore calls for learning semantically related words in software, which can then be used for query expansion. This paper designs a word-embedding-based method for learning semantically related words in software and, by training it on Stack Overflow documents, obtains semantically related words for 19,332 words. Experimental results show that the learned semantically related words effectively improve code search accuracy.

Key words: code search, query expansion, semantically related words
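
The idea of embedding-based query expansion described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the vectors in `EMBEDDINGS` are hypothetical hand-written values (real vectors would be learned from a corpus such as Stack Overflow documents), and the function names, `top_k`, and `min_sim` are assumptions made for illustration. The sketch ranks candidate words by cosine similarity to each query term and appends the closest ones to the query.

```python
import math

# Hypothetical toy embeddings, hand-written for illustration only.
# In practice these vectors would be learned from a text corpus.
EMBEDDINGS = {
    "delete": [0.9, 0.1, 0.0],
    "remove": [0.85, 0.15, 0.05],
    "erase": [0.8, 0.2, 0.1],
    "file": [0.1, 0.9, 0.1],
    "document": [0.15, 0.85, 0.2],
    "thread": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_words(word, embeddings, top_k=2, min_sim=0.9):
    """Return up to top_k words whose similarity to `word` is at least min_sim."""
    if word not in embeddings:
        return []  # out-of-vocabulary terms get no expansion
    target = embeddings[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    scored = [(w, s) for w, s in scored if s >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scored[:top_k]]


def expand_query(query, embeddings, top_k=2, min_sim=0.9):
    """Append semantically related words to each term of a keyword query."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + related_words(term, embeddings, top_k, min_sim):
            if candidate not in expanded:  # keep order, drop duplicates
                expanded.append(candidate)
    return expanded
```

With these toy vectors, `expand_query("delete file", EMBEDDINGS)` yields `["delete", "remove", "erase", "file", "document"]`, so a search for "delete file" could also match code that uses "remove" or "document". The similarity threshold keeps unrelated words such as "thread" out of the expanded query.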
