Computer and Modernization


Semantic Retrieval Method for Water Conservancy Metadata Based on Hadoop

  

  1. College of Computer and Information, Hohai University, Nanjing 211100, China
  • Received: 2015-08-11 Online: 2015-12-23 Published: 2015-12-30

Abstract: Metadata search engines in the water conservancy domain lack semantic understanding, and indexing water conservancy metadata is inefficient. To address both problems, this paper proposes a Hadoop-based semantic retrieval method for water conservancy metadata. First, a semantic search method combining ontology and query expansion is used to design ontology reasoning rules, a semantic similarity calculation method, an expansion-word selection method, and a semantic relevance ranking method, so as to improve the recall and precision of search results. Second, to address the low efficiency of building an index over water conservancy metadata in XML form, the MapReduce parallel processing model of the Hadoop platform is introduced to parse the metadata, extract its information, and build the index in parallel; the SequenceFile file structure is also modified to cope with the many small files of water conservancy metadata and with the performance bottleneck of index building in a centralized environment. Finally, a semantic expansion query method is designed that exploits the parallel computing capability of Hadoop to improve the query efficiency of water conservancy metadata.
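
The abstract does not give the paper's similarity formula or its rule for selecting expansion words, so the following is only a minimal sketch of the general idea of ontology-based query expansion. It assumes a toy is-a hierarchy and a path-length similarity of 1 / (1 + d), where d is the number of edges between two concepts; the ontology contents, the function names, the threshold, and the similarity measure are all illustrative assumptions, not the authors' method.

```python
from collections import deque

# Hypothetical toy ontology (not from the paper): concept -> parent concepts.
ONTOLOGY = {
    "reservoir": ["water body"],
    "lake": ["water body"],
    "river": ["water body"],
    "water body": ["hydrological object"],
    "rainfall station": ["monitoring station"],
    "monitoring station": ["hydrological object"],
    "hydrological object": [],
}


def build_adjacency(graph):
    """Turn the child -> parents map into an undirected adjacency map."""
    adj = {concept: set() for concept in graph}
    for child, parents in graph.items():
        for parent in parents:
            adj[child].add(parent)
            adj.setdefault(parent, set()).add(child)
    return adj


def path_length(graph, a, b):
    """Shortest number of is-a edges between a and b (BFS), or None."""
    adj = build_adjacency(graph)
    if a not in adj or b not in adj:
        return None
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def similarity(graph, a, b):
    """Assumed path-based similarity: 1 / (1 + shortest path length)."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def expand_query(graph, term, threshold=0.3, max_terms=3):
    """Pick the concepts most similar to `term` as expansion words.

    Candidates below `threshold` are dropped; the rest are ranked by
    descending similarity, with ties broken alphabetically.
    """
    scored = [
        (similarity(graph, term, concept), concept)
        for concept in build_adjacency(graph)
        if concept != term
    ]
    kept = [(score, concept) for score, concept in scored if score >= threshold]
    kept.sort(key=lambda item: (-item[0], item[1]))
    return [concept for _, concept in kept[:max_terms]]


if __name__ == "__main__":
    print(expand_query(ONTOLOGY, "reservoir"))
```

In a retrieval pipeline of this kind, the expansion words would be added to the original query term before the index is searched, and the same similarity scores could serve as weights when ranking results by semantic relevance.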

Key words: domain ontology, calculation of similarity, semantic query, Hadoop, SequenceFile
