Computer and Modernization


 Method of Web Information Extraction Based on Ontology Theory

  

  1. Shanghai Branch, Computer Network Emergency Response Technical Team/Coordination Center of China, Shanghai 201315, China
  • Received: 2015-03-02 Online: 2015-09-21 Published: 2015-09-24

Abstract:

To obtain Web information on a specific topic, this paper uses an ontology-based method to measure topic correlation and thereby improve the quality of Web information
extraction. Building on the Vector Space Model (VSM), the method calculates the weights of feature words and uses the ontology to compute topic correlation. In doing so, it both
simplifies the dimensional computation in VSM and extends the semantic range. A practical application system with a layered architecture is used to demonstrate the
implementation of the method. The application results show that the proposed method extracts Web information on a specific topic more accurately and reduces the computational
complexity of the system, while improving the recall and precision of Web information extraction. This reduces the number of missed pages and improves the overall quality of
Web information extraction.

Key words:  ontology, Web information extraction, correlation, crawler, Web
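
The abstract does not give the weighting or correlation formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy ontology (`ONTOLOGY`), a hand-picked topic vector (`TOPIC`), and cosine similarity as the correlation measure; all names, weights, and the threshold are hypothetical. Page terms are folded into ontology concepts, which keeps the vector dimension equal to the number of concepts and lets related terms such as "spider" count toward "crawler".

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology: concept -> {surface term: semantic weight}.
# The weights are illustrative assumptions, not values from the paper.
ONTOLOGY = {
    "crawler": {"crawler": 1.0, "spider": 0.9, "robot": 0.6},
    "extraction": {"extraction": 1.0, "extract": 1.0, "parsing": 0.7},
    "ontology": {"ontology": 1.0, "taxonomy": 0.7, "concept": 0.5},
}

# Hypothetical topic vector: importance of each concept to the target topic.
TOPIC = {"crawler": 0.8, "extraction": 1.0, "ontology": 0.6}


def concept_vector(text):
    """Map page text to a concept-space vector using the ontology.

    Each concept's score is the weighted count of its related terms, so the
    vector has at most one dimension per concept rather than one per word.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    vector = {}
    for concept, terms in ONTOLOGY.items():
        score = sum(counts[term] * weight for term, weight in terms.items())
        if score:
            vector[concept] = score
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_correlation(text):
    """Correlation between a page and the topic (cosine is an assumption)."""
    return cosine(concept_vector(text), TOPIC)


def is_relevant(text, threshold=0.5):
    """Keep a page when its correlation reaches a hypothetical threshold."""
    return topic_correlation(text) >= threshold
```

In a crawler, a check like `is_relevant(page_text)` would decide whether a fetched page is kept for extraction; a page that never mentions "crawler" but does mention "spider" still scores on the crawler concept, which is the sense in which the ontology extends the semantic range.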