Computer and Modernization ›› 2021, Vol. 0 ›› Issue (03): 77-81.

• Databases and Data Mining •

Extracting Key Elements of Traffic Accident Litigation Cases Based on CRF

  1. (School of Information and Communication Engineering, North University of China, Taiyuan 030051, China)
  • Online: 2020-03-30 Published: 2021-03-24
  • About the authors: GUO Fansha (b. 1994), female, from Jincheng, Shanxi, master's student, research interest: natural language processing, E-mail: 934816724@qq.com; YANG Fengbao (b. 1968), male, professor, Ph.D., research interests: natural language processing, information fusion and processing, E-mail: yfengb@163.com.
  • Supported by:
    Shanxi Province Graduate Innovation Project (2020SY368); Youth Science and Technology Research Fund of the Shanxi Province Applied Basic Research Program (201901D211233)

Abstract: Personnel involved in adjudicating litigation cases face cluttered case files, scattered information gathering, and long handling times. To address these problems, this paper proposes a model based on Conditional Random Fields (CRF) for extracting key elements from traffic accident litigation cases. Drawing on information extraction techniques, the work constructs a tag set for key elements, builds an annotated corpus, and designs several feature templates that reflect the textual characteristics of traffic accident litigation documents, taking into account the window length and the selection and combination of different features. Training and testing are carried out on the PyCharm platform. The experimental results show that the best feature template extracts the key elements of traffic accident litigation cases with an F1 score of 80.15%, and that the choice of word segmentation tool affects the recognition results. The proposed model is an exploratory step toward delivering fair and just rulings quickly and accurately.

Key words: litigation cases, CRF, key elements, feature templates
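
The abstract describes feature templates that vary the window length and combine different features. The sketch below illustrates that general idea only; it is not the authors' template. The function names, the `<PAD>` marker, the use of part-of-speech tags as a second feature, and the sample tokens are all assumptions made for illustration, since the abstract does not list the actual features or name the CRF toolkit used.

```python
from typing import Dict, List


def token_features(tokens: List[str], pos_tags: List[str],
                   i: int, window: int = 2) -> Dict[str, str]:
    """Build CRF-style features for token i from a symmetric context window.

    Hypothetical template: word and POS tag at each offset in
    [-window, +window], plus one combined previous-word/current-word feature.
    """
    feats = {"bias": "1"}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"w[{offset}]"] = tokens[j]
            feats[f"pos[{offset}]"] = pos_tags[j]
        else:
            # Positions outside the sentence get a padding marker.
            feats[f"w[{offset}]"] = "<PAD>"
            feats[f"pos[{offset}]"] = "<PAD>"
    # Combined feature: previous word joined with the current word.
    if i > 0:
        feats["w[-1]|w[0]"] = tokens[i - 1] + "|" + tokens[i]
    return feats


def sentence_features(tokens: List[str], pos_tags: List[str],
                      window: int = 2) -> List[Dict[str, str]]:
    """Apply the template to every token of one segmented sentence."""
    return [token_features(tokens, pos_tags, i, window)
            for i in range(len(tokens))]


# Assumed sample: a pre-segmented sentence with made-up POS tags.
sample_tokens = ["被告人", "驾驶", "轿车"]
sample_pos = ["n", "v", "n"]
sample_feats = sentence_features(sample_tokens, sample_pos, window=1)
```

Changing `window` widens or narrows the context each token sees, which is the kind of variation the abstract compares. Because the features depend on token boundaries, a different word segmentation tool produces different `tokens` and therefore different features, consistent with the reported sensitivity to the segmenter.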