Computer and Modernization


 A Chinese Address Resolution Model Based on Trie Tree and Finite Automata

  

  1. Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074, China;
  2. FiberHome Communications Science & Technology Development Co. Ltd., Nanjing 210019, China
  • Received: 2016-01-04 Online: 2016-07-21 Published: 2016-07-22

Abstract:  To date, there is no relatively mature model for Chinese address resolution in either the academic or the commercial field. Element identification is the main technique in address resolution. The traditional approach, which identifies address elements by matching feature words against a dictionary, has difficulty resolving non-canonical addresses. This paper proposes the T-FA model to solve the problem of address segmentation and grading. Specifically, a Trie-tree model is adopted to resolve administrative regions, and a Finite-Automata (FA) model is used to extract the elements of non-canonical addresses; both are common techniques in natural language processing. Fuzzy search and recognition of address elements are handled by a word segmentation method based on the hidden Markov model together with the Longest Common Subsequence (LCS) algorithm. Compared with state-of-the-art methods, the T-FA model generalizes better when batch processing address information and is more effective at resolving non-canonical addresses.

Key words:  natural language processing, address resolution, element identification, Trie tree model, finite automata model
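
As an illustration of two of the techniques named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a Trie keyed on single characters for longest-prefix matching of administrative region names, and an LCS-based similarity score for fuzzy matching of address elements. The names `RegionTrie`, `split_regions`, `lcs_length`, and `lcs_similarity`, the level labels, and the sample regions are all hypothetical; the HMM word segmentation and the finite automaton are not covered.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set only on nodes where a complete region name ends.
        self.level: Optional[str] = None


class RegionTrie:
    """Character-level Trie of administrative region names (hypothetical)."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, name: str, level: str) -> None:
        node = self.root
        for ch in name:
            node = node.children.setdefault(ch, TrieNode())
        node.level = level

    def longest_prefix(self, text: str, start: int = 0) -> Optional[Tuple[str, str]]:
        """Return the longest region name beginning at text[start], with its level."""
        node = self.root
        best = None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.level is not None:
                best = (text[start:i + 1], node.level)
        return best


def split_regions(trie: RegionTrie, address: str) -> Tuple[List[Tuple[str, str]], str]:
    """Peel region names off the front of an address; return them and the remainder."""
    elements = []
    pos = 0
    while pos < len(address):
        match = trie.longest_prefix(address, pos)
        if match is None:
            break
        elements.append(match)
        pos += len(match[0])
    return elements, address[pos:]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, using row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Assumed scoring: LCS length normalised by the mean length of the two strings."""
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


# Illustrative sample data only.
trie = RegionTrie()
trie.insert("湖北省", "province")
trie.insert("武汉市", "city")
trie.insert("洪山区", "district")
```

In this sketch, `split_regions(trie, "湖北省武汉市洪山区邮科院路88号")` separates the three region names from the street-level remainder, and `lcs_similarity("武汉市", "武汉")` gives a score that could be compared against a threshold to accept an abbreviated region name. The threshold and the normalisation are design choices made for the example, not values taken from the paper.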