Computer and Modernization, 2016, Vol. 0, Issue (4): 1-6. doi: 10.3969/j.issn.1006-2475.2016.04.001


Structured Approach for Pathological Microscopy Text

  

  1. School of Computer Science and Technology, Donghua University, Shanghai 201620, China; 2. Computer Center of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 201620, China
  Received: 2015-11-09 Online: 2016-04-14 Published: 2018-09-30

Abstract:

Current approaches to structuring medical text data mostly depend on general-purpose word segmentation software or specialized terminology libraries. However, general-purpose segmentation tools recognize specialized vocabulary poorly, and no mature standard terminology library has been established for Chinese. To address these problems, this paper proposes a method for structuring medical text data based on statistical information. After clustering the text, the method segments it using breakpoint words and coincident strings, derives keywords and word-type information from the statistics of the segmented word strings, and then expands the word set to obtain the final lexicon, which serves as the segmentation dictionary. Word segmentation is then carried out with a dictionary-based two-way maximum matching algorithm, and structured data is obtained by adding negation detection rules. Experiments show that the accuracy of the specialized vocabulary library obtained by this method reaches 80%, and that the method can produce structured data without the help of segmentation tools.
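
The abstract names two-way maximum matching as the dictionary-based segmentation step but does not give its details. The following is a minimal Python sketch of the general technique, not the authors' implementation. The function names, the toy lexicon, and the tie-breaking rule (prefer fewer tokens, then fewer single-character tokens, otherwise the backward result) are assumptions chosen for illustration; the paper may resolve disagreements differently.

```python
def forward_max_match(text, lexicon, max_len):
    """Greedy left-to-right segmentation, longest dictionary word first."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, lexicon, max_len):
    """Greedy right-to-left segmentation, longest dictionary word first."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def bidirectional_max_match(text, lexicon):
    """Run both passes and pick one result with a simple heuristic.

    Assumed tie-breaking (not taken from the paper): fewer tokens wins;
    on a tie, fewer single-character tokens wins; otherwise the
    backward result is returned.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd

    def single_count(tokens):
        return sum(1 for token in tokens if len(token) == 1)

    return fwd if single_count(fwd) < single_count(bwd) else bwd


# Hypothetical toy lexicon; in the paper the lexicon is built from
# corpus statistics rather than written by hand.
toy_lexicon = {"研究", "研究生", "生命", "起源"}
print(bidirectional_max_match("研究生命起源", toy_lexicon))
```

In this toy case the forward pass greedily takes "研究生" and strands the single character "命", while the backward pass yields three dictionary words, so the heuristic selects the backward segmentation.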

Key words: medical text data, structuring text data, statistics, word segmentation, two-way maximum matching

