Computer and Modernization ›› 2011, Vol. 7 ›› Issue (7): 105-107

• Algorithm Analysis and Design •

Performance Analysis of Different Regular Expression Matching Engines

DENG Kai-yuan1, JIANG Lei2

  1. School of Photoelectric Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China; 2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2011-05-09 Online: 2011-07-15 Published: 2011-07-15

Abstract: Regular expressions have remarkable descriptive power, and regular expression matching is widely used in computing. A number of regular expression matching engines have been developed, and in practical applications they perform differently depending on the rule set and the regular expression syntax in use. This paper benchmarks four commonly used engines, PCRE, Greta, Boost and RE2, and reports their matching speeds under different regular expression syntaxes. It then analyzes which engine is suitable for which environment. The experimental results and analysis can serve as a guideline for choosing a regular expression library in practical system design.

Key words: regular expression, PCRE, pattern matching, NFA, DFA
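
The abstract's central point, that matching speed depends on the regular expression syntax as well as on the engine, can be illustrated with a small timing harness. The sketch below is not the paper's test setup: it uses Python's built-in `re` module (a backtracking engine) as a stand-in for the four libraries studied, and the input text, the two patterns, and the function names `time_search` and `benchmark` are hypothetical choices made only for illustration.

```python
import re
import time

# Hypothetical input: a short run of 'a' followed by a character that
# makes the second pattern below fail only after much backtracking.
TEXT = "a" * 18 + "b"

# Hypothetical patterns, one per syntax style being compared.
PATTERNS = {
    "bounded repeat": r"a{18}b",      # matches directly, no ambiguity
    "nested quantifier": r"(a+)+c",   # no match; many ways to split the a's
}


def time_search(pattern: str, text: str, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds for one search of text."""
    compiled = re.compile(pattern)  # compile once, outside the timed region
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        compiled.search(text)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(patterns: dict, text: str) -> dict:
    """Time every pattern against the same text and return name -> seconds."""
    return {name: time_search(p, text) for name, p in patterns.items()}


if __name__ == "__main__":
    for name, seconds in benchmark(PATTERNS, TEXT).items():
        print(f"{name:>18}: {seconds * 1e3:.3f} ms")
```

On a backtracking engine the nested-quantifier pattern is expected to be far slower than the bounded repeat on this input, whereas an automaton-based engine such as RE2 is designed to match in time linear in the input length. The same harness shape could be pointed at another engine's bindings to compare libraries, which is the kind of measurement the abstract describes.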