计算机与现代化

• 信息安全 • 上一篇    下一篇

基于膨胀系数的正则表达式分组算法

  

  1. (1.重庆大学计算机学院,重庆 400030; 2.重庆大学信息与网络管理中心,重庆 400030; 3.重庆大学软件学院,重庆 401331)
  • 收稿日期:2014-03-10 出版日期:2014-07-16 发布日期:2014-07-17
  • 作者简介:王美阳(1985-),男,江西上饶人,重庆大学计算机学院硕士研究生,研究方向:网络安全,下一代互联网; 唐学文(1955-),男,重庆大学信息与网络管理中心高级工程师,研究方向:网络技术与信息安全; 杨正益(1979-),男,重庆大学软件学院讲师,博士,研究方向:计算机软件与智能控制,程序性能优化。
  • 基金资助:
    国家自然科学基金资助项目(51105396)

Regular Expression Grouping Algorithm Based on Expansion Coefficient

  1. (1. School of Computer Science, Chongqing University, Chongqing 400030, China;
    2. Information and Campus Network Management Center, Chongqing University, Chongqing 400030, China;
    3. School of Software, Chongqing University, Chongqing 401331, China)
  • Received:2014-03-10 Online:2014-07-16 Published:2014-07-17

摘要: 确定性有限自动机(Det­erministic Finite Automata, DFA)匹配速度远快于非确定性有限状态自动机(Non-deterministic Finite state Autom­ata, NFA),但大量正则表达式转换为DFA时会引起状态爆炸而占用巨大的存储空间。首先定义膨胀系数(Expansion Coefficient, EC)来描述正则表达式的膨胀特性,然后在膨胀系数这一概念基础上,提出一种高效的分组算法——IGA(Improved Grouping Algorithm)算法对正则表达式进行有效分组,将容易引起状态爆炸的正则表达式相互隔离,从而节省存储空间。实验结果表明,与原有算法相比,在相同分组数目时IGA算法平均能够减少25%的状态数。

关键词: 膨胀系数, 正则表达式, 分组算法, DFA

Abstract: Matching speed of det­erministic finite automata (DFA) is far faster than that of the non-deterministic finite state autom­ata (NFA). However, when a large number of regular expressions transform in­to a DFA, the D­FA may consume huge storage space because of state explosion. Firstly, to describe characteristics of the regular expressions, expansion coefficient was defined. Based on the concept of expansion coefficient, a m­ore efficient grou­p­­­ing algorithm which named IGA algorithm was prop­o­sed. IGA algorithm i­solated those regular expressions which easily lead to state explosion, thus saved st­orage space. Experimental results show that, compared with the original algorithm, when the number of groups is same to the original one, IGA algorithm could reduce 25 percents of the number of states averagely.

Key words: expansion coefficient, regular expression, grouping algorithm, DFA

中图分类号: