Phoenix's Blog


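To make them easier to use from MASM, I use regular expressions to extract constants, functions, callbacks, and so on from C++ header files.
The problem appeared when the extractor reached these two macro definitions: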
#define NdrUnMarshConfStringHdr(p, s, l)    ((s=_midl_unma4(p,unsigned long),\
                                            (_midl_addp(p,4)),               \
                                            (l=_midl_unma4(p,unsigned long))


#define NdrMarshSCtxtHdl(pc,p,rd)   (NdrSContextMarshall((NDR_SCONTEXT)pc,p, (NDR_RUNDOWN)rd)

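The parentheses in both definitions are unbalanced, so the match attempts timed out. The balanced-parenthesis part of the pattern can never succeed on such input, and the engine keeps backtracking instead of failing quickly.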
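One way to avoid the timeout is to reject unbalanced definitions before the regular expression ever sees them. The following is a minimal Python sketch of such a pre-check, not part of the original extractor; the function names `join_continuations` and `parens_balanced` are hypothetical, and the check deliberately ignores string literals and comments.

```python
import re


def join_continuations(text: str) -> str:
    # A backslash at the end of a line continues the macro on the next line.
    # Trailing blanks after the backslash are tolerated here (an assumption).
    return re.sub(r"\\[ \t]*\r?\n", " ", text)


def parens_balanced(text: str) -> bool:
    """Return True if every '(' is closed and no ')' appears too early."""
    depth = 0
    for ch in join_continuations(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '('
                return False
    return depth == 0


MACRO = (
    "#define NdrMarshSCtxtHdl(pc,p,rd)   "
    "(NdrSContextMarshall((NDR_SCONTEXT)pc,p, (NDR_RUNDOWN)rd)"
)

print(parens_balanced(MACRO))  # False
```

The scan is a single pass over the text, so it costs almost nothing compared with a regular expression that has to backtrack.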

The regular expression is generated by a program:

(\#define\s+)
  (?'Name'([a-zA-Z_]\w+))
  (?'Params'(\([^\(\)\#]*\)))?
   \s+
  ((0(x|X)[0-9a-fA-F]+)|
  (\d+)|
  ([a-zA-Z_]\w+)|
  ("(?>[^"]+|"")*")|
  (\(([^\(\)\#]+|
  (?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))|
  (\+|\-|\*|/\\)|
  (([a-zA-Z_]\w+)(\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))))?

Even after simplifying the pattern to either of the following two forms, matching still timed out:
(\#define\s+)
 (?'Name'([a-zA-Z_]\w+))
  (?'Params'(\([^\(\)\#]*\)))?
   \s+
  (\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))


(\#define\s+)
   (?'Name'([a-zA-Z_]\w+))
     (.+?)
     (\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))
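The balanced-parenthesis part of these patterns can also be handled without a regular expression. Below is a minimal Python sketch of a single-pass scanner; the name `match_balanced` is hypothetical. It mirrors two details of the sub-pattern above as I read it: a `#` makes the match fail, and the shortest balanced group is returned, as the lazy `*?` would do.

```python
from typing import Optional


def match_balanced(text: str, start: int) -> Optional[int]:
    """Return the index just past the ')' that closes the '(' at text[start].

    Returns None if text[start] is not '(', if a '#' is met first,
    or if the group is never closed.
    """
    if start >= len(text) or text[start] != "(":
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "#":
            return None
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    return None  # ran out of text with parentheses still open


body = "(NdrSContextMarshall((NDR_SCONTEXT)pc,p, (NDR_RUNDOWN)rd)"
print(match_balanced(body, 0))              # None
print(match_balanced("((a)+(b)) rest", 0))  # 9
```

Each character is visited once, so an unbalanced body is rejected in linear time instead of triggering a timeout.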


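Some other patterns and input texts that cause a timeout (each pattern is followed by the text it was matched against):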

^[a-zA-Z0-9]+((.?|\-*)[a-zA-Z0-9]+)*$
asdf.host-name.asd-f.

^(([a-zA-Z\d!#$%&'*+-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?
((?!\.)(\.?[a-zA-Z\d!#$%&'*+-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")
@
(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}
|\[
(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}
|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)
\])
(?(angle)>)$
www42af43ds.afsd.fds.ds


^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@(([0-9a-zA-Z])+([-\w]*[0-9a-zA-Z])*\.)+[a-zA-Z]{2,9})$
hello23423423423424n@aol.c
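In the last pattern the slowdown seems to come from nested quantifiers such as `([-.\w]*[0-9a-zA-Z])*`: the same text can be split between the inner and the outer repetition in many ways, and when the match fails near the end (here the one-letter `c` after the last dot) the engine tries all of them. The sketch below is a Python rewrite, under the assumption that Python's `re` backtracks in a similar way to the .NET engine. Because `[0-9a-zA-Z]` is a subset of `[-.\w]`, repeating the group adds nothing, so the outer `*` can become `?`. The names `ORIGINAL`, `REWRITTEN` and `is_valid` are hypothetical.

```python
import re

# The pattern from the post, split across two lines for readability.
ORIGINAL = re.compile(
    r"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@"
    r"(([0-9a-zA-Z])+([-\w]*[0-9a-zA-Z])*\.)+[a-zA-Z]{2,9})$"
)

# Same language, but each part can now be matched in only one way:
# the repeated groups are optional instead of starred.
REWRITTEN = re.compile(
    r"^[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])?@"
    r"(?:[0-9a-zA-Z](?:[-\w]*[0-9a-zA-Z])?\.)+[a-zA-Z]{2,9}$"
)


def is_valid(address: str) -> bool:
    return REWRITTEN.match(address) is not None


# The two patterns agree on short inputs, where the original is still fast.
for sample in ["a@b.cc", "a.b@c-d.com", "a-@b.cc", "ab@aol.c"]:
    assert bool(ORIGINAL.match(sample)) == is_valid(sample)

print(is_valid("hello23423423423424n@aol.c"))    # False
print(is_valid("hello23423423423424n@aol.com"))  # True
```

The same idea applies to the other patterns in this list: look for a repeated group whose contents can also match the text around it, and remove the ambiguity.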

posted on 2005-07-27 10:14  Phoenix