Computer and Modernization ›› 2022, Vol. 0 ›› Issue (05): 10-15.

 Traffic Accident Text Information Extraction Model Based on BERT and BiGRU-CRF Fusion

  

  1. (Institute of Information Engineering, Chang’an University, Xi’an 710064, China)

  • Online: 2022-06-08 Published: 2022-06-08

Abstract: Key heterogeneous information such as time, place, and casualty losses is difficult to extract effectively from existing traffic accident text data, and extraction methods based on deep learning models with static word vectors have low accuracy. To address these problems, BERT (Bidirectional Encoder Representations from Transformers) is used to map the text characters to dynamic vectors, which mitigates ambiguity and insufficient context dependence at the level of data representation. BiGRU (Bidirectional Gated Recurrent Unit) then extracts features from the vectorized text and outputs sequences of high-level text features. A CRF (Conditional Random Field) layer computes the probability of the globally optimal output sequence to refine the labeling of the text sequence. On this basis, a BERT-BiGRU-CRF fusion model based on dynamic word vectors is proposed for extracting key information from traffic accident text. Comparison experiments show that the model achieves an average accuracy of 0.952 and an F1 score of 0.925 in traffic accident text information extraction, which are 6.3 and 7.9 percentage points higher, respectively, than those of the model based on static Word2Vec word vectors.
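
The role of the CRF layer described in the abstract can be illustrated with a minimal sketch of Viterbi decoding, the standard way to find the globally optimal tag sequence in a linear-chain CRF. Everything concrete here is an assumption for illustration: the BIO-style tag names, the hand-written emission scores (standing in for what a BERT + BiGRU encoder might output per character), the transition scores, and the function names. None of them come from the paper, whose label set and learned parameters are not given in this abstract.

```python
from typing import List

def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]],
                   tags: List[str]) -> List[str]:
    """Return the tag sequence with the highest total score.

    emissions[t][j]   : score of tag j at position t (hypothetical encoder output)
    transitions[i][j] : score of moving from tag i to tag j (hypothetical CRF weights)
    """
    n_tags = len(tags)
    # score[j] = best score of any path that ends in tag j at the current position
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Walk the backpointers from the best final tag to recover the full path
    best_last = max(range(n_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]

def greedy_decode(emissions: List[List[float]], tags: List[str]) -> List[str]:
    """Per-position argmax, ignoring transitions (what a model without a CRF would do)."""
    return [tags[max(range(len(tags)), key=lambda j: row[j])] for row in emissions]

# Illustrative three-character example with an assumed tag set.
tags = ["O", "B-TIME", "I-TIME"]
emissions = [
    [0.1, 2.0, 0.3],
    [1.0, 0.2, 0.9],
    [0.2, 0.1, 1.5],
]
# Assumed transition scores: only the invalid move O -> I-TIME is penalized.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 0.0],    # from B-TIME
    [0.0, 0.0, 0.0],    # from I-TIME
]

greedy_path = greedy_decode(emissions, tags)
crf_path = viterbi_decode(emissions, transitions, tags)
print(greedy_path)  # ['B-TIME', 'O', 'I-TIME']
print(crf_path)     # ['B-TIME', 'I-TIME', 'I-TIME']
```

In this toy case the per-character argmax yields an inconsistent sequence (an `I-TIME` directly after `O`), while decoding over the whole sequence with transition scores returns a consistent span. This is the kind of sequence-level correction that a CRF layer on top of BiGRU features is intended to provide.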

Key words: deep learning, text information extraction, heterogeneous information, BERT, BiGRU, CRF