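ABAP Study Notes (22): Using Regular Expressions
ABAP Regular Expressions
ABAP supports regular expressions.
Regular expressions can be used in:
1. The FIND and REPLACE statements;
2. Functions: count, count_xxx, contains, find, find_xxx, match, matches, replace, substring, substring_xxx;
3. Classes: CL_ABAP_REGEX, CL_ABAP_MATCHER.
1. Regular expression syntax rules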
1.1 Single Character Patterns
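Single literal characters: A-Z, 0-9, and so on each match themselves; a special character becomes a literal character when it is escaped with a backslash (\).
Special characters: . [ ] - ^ and \ act as operators; - and ^ are special only inside [].
Example:
"1. Single Character Patterns
"Examples:
"regex: A    string: a   result: no match
"regex: AB   string: A   result: no match
IF cl_abap_matcher=>matches( pattern = 'A' text = 'A' ) = abap_true.
  WRITE:/ '1.true'.
ENDIF.

"Special operator characters: . [ ] - ^
". stands for any single character;
"\ turns the special character that follows it into a literal character;
"\ followed by certain letters stands for a set of characters
"(the negated forms \D, \L, \S, \U, \W cannot be used inside []):
"1. \C: any single character;
"2. \d: any digit;
"3. \D: any non-digit;
"4. \l: any lowercase letter;
"5. \L: any character that is not a lowercase letter;
"6. \s: any blank character;
"7. \S: any non-blank character;
"8. \u: any uppercase letter;
"9. \U: any character that is not an uppercase letter;
"10. \w: any alphanumeric character or underscore;
"11. \W: any character that is not alphanumeric or an underscore;
"[] defines a set of characters; one character from the set is enough for a match;
"[^x] negates the set; any single character that is not in the set matches;
"[x-x] defines a range, for example A-Z, a-z, 0-9;
"Character classes predefined by ABAP (written inside a set, e.g. [[:alnum:]]):
"1. [:alnum:] all alphanumeric characters;
"2. [:alpha:] all letters;
"3. [:digit:] all digits;
"4. [:blank:] the blank character and the horizontal tab;
"5. [:cntrl:] all control characters;
"6. [:graph:] all displayable characters except the blank and the horizontal tab;
"7. [:lower:] all lowercase letters;
"8. [:print:] all displayable characters (the union of [:graph:] and [:blank:]);
"9. [:punct:] all punctuation characters;
"10. [:space:] all blank characters, tabs, and line feeds;
"11. [:unicode:] all characters with a code greater than 255 (Unicode systems only);
"12. [:upper:] all uppercase letters;
"13. [:word:] all alphanumeric characters plus the underscore _;
"14. [:xdigit:] all hexadecimal digits ("0"-"9", "A"-"F", and "a"-"f");
"Examples:
"regex: \.         string: .    result: match
"regex: \C         string: A    result: match
"regex: ..         string: AB   result: match
"regex: [ABC]      string: A    result: match
"regex: [AB][CD]   string: AD   result: match
"regex: [^A-Z]     string: 1    result: match
"regex: [A-Z-]     string: -    result: match
IF cl_abap_matcher=>matches( pattern = '[A-Z-]' text = 'A' ) = abap_true.
  WRITE:/ '2.true'.
ENDIF.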
1.2 Character string patterns
Single-character patterns are concatenated to match character strings.
Special characters: { } * + ? ( ) | \
Example:
"2. Character string patterns
"Examples:
"regex: h[ae]llo   string: hello   result: match
"regex: h[ae]llo   string: hallo   result: match
IF cl_abap_matcher=>matches( pattern = 'h[ae]llo' text = 'hello' ) = abap_true.
  WRITE:/ '3.true'.
ENDIF.

"Special characters: { } * + ? ( ) | \
"x{n}:    x occurs exactly n times;
"x{n,m}:  x occurs n to m times;
"x*:      x occurs {0,} times;
"x+:      x occurs {1,} times;
"x?:      x occurs {0,1} times;
"a|b:     matches a or b;
"( ):     groups a subexpression and registers what it matched;
"(?:xxx): groups xxx without registering it (non-capturing group);
"\1, \2, ...: refer back to the registered groups, numbered from left to right;
"\Qxxx\E: special characters between \Q and \E are treated as literal characters.
"Examples:
"regex: hi{2}o        string: hiio    result: match
"regex: hi{1,3}o      string: hiiio   result: match
"regex: hi?o          string: ho      result: match
"regex: hi*o          string: ho      result: match
"regex: hi+o          string: hio     result: match
"regex: .{0,4}        matches any 0 to 4 characters
"regex: a|bb|c        string: bb      result: match
"regex: h(a|b)o       string: hao     result: match
"regex: (a|b)(?:ac)   string: bac     result: match
"regex: (").*\1       string: "hi"    result: match
IF cl_abap_matcher=>matches( pattern = '(a|b)(?:ac)' text = 'bac' ) = abap_true.
  WRITE:/ '4.true'.
ENDIF.
IF cl_abap_matcher=>matches( pattern = '(").*\1' text = '"hi"' ) = abap_true.
  WRITE:/ '5.true'.
ENDIF.

DATA: text          TYPE string.
DATA: result_tab    TYPE match_result_tab.
DATA: wa_result_tab TYPE match_result.
text = 'aaaaaabaaaaaaacaaaa'.
FIND ALL OCCURRENCES OF REGEX '(a+)(a)' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
1.3 Search Pattern
Matching at the start and end of lines, strings, and words.
Example:
"3. Search Pattern
"Special characters: ^ $ \ ( ) = !
"Example 1: Start and end of a line
"^ and $ anchor the start and the end of each line
text = |Line1\nLine2\nLine3|.
FIND ALL OCCURRENCES OF REGEX '^' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
FIND ALL OCCURRENCES OF REGEX '$' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 2: Start and end of a character string
"\A and \z anchor the start and the end of the whole character string
DATA: t_text(10) TYPE c.
DATA: t_text_tab LIKE TABLE OF text.
APPEND ' Smile' TO t_text_tab.
APPEND ' Smile' TO t_text_tab.
APPEND ' Smile' TO t_text_tab.
APPEND ' Smile' TO t_text_tab.
APPEND ' Smile' TO t_text_tab.
APPEND ' Smile' TO t_text_tab.
FIND ALL OCCURRENCES OF REGEX '\A(?:Smile)|(?:Smile)\z' IN TABLE t_text_tab RESULTS result_tab.
WRITE:/ 'Smile matches'.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 3
"\z matches only at the very end of the string; \Z ignores trailing line breaks
text = |... this is the end\n\n\n|.
FIND REGEX 'end\z' IN text.
IF sy-subrc <> 0.
  WRITE / `There's no end.`.
ENDIF.
FIND REGEX 'end\Z' IN text.
IF sy-subrc = 0.
  WRITE / `The end is near the end.`.
ENDIF.

"Example 4: Start and End of Word
"\< matches at the start of a word, \> at the end of a word
"\b matches at either the start or the end of a word
"Find words that start with s
text = `Sometimes snow seems so soft.`.
FIND ALL OCCURRENCES OF REGEX '\<s' IN text IGNORING CASE RESULTS result_tab.
WRITE:/ 'starts with s', text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
"Find every s that is followed by a word boundary, i.e. an s at the end of a word
FIND ALL OCCURRENCES OF REGEX 's\b' IN text IGNORING CASE RESULTS result_tab.
WRITE:/ 'ends with s', text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 5: Preview Condition
"The text checked by a preview condition is not part of the match result
"(?=x): the match must be followed by x
"(?!x): the match must not be followed by x
text = `Shalalala!`.
FIND ALL OCCURRENCES OF REGEX '(?:la)(?=!)' IN text RESULTS result_tab.
WRITE:/ text.
"Only the last 'la' is found; the '!' is not part of the match
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 6: Cut operator
"(?>...) keeps the first alternative that succeeds instead of the longest overall match
DATA: s_text TYPE string.
DATA: moff   TYPE i.
DATA: mlen   TYPE i.
s_text = `xxaabbaaaaxx`.
"Without the cut operator the longest match 'aabbaaaa' is found
FIND REGEX 'a+b+|[ab]+' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
WRITE:/ s_text.
IF sy-subrc = 0.
  WRITE:/ moff.
  WRITE:/ mlen.
  WRITE:/ s_text+moff(mlen).
ENDIF.
"With the cut operator the search stops at 'aabb'
FIND REGEX '(?>a+b+|[ab]+)' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
WRITE:/ s_text.
IF sy-subrc = 0.
  WRITE:/ moff.
  WRITE:/ mlen.
  WRITE:/ s_text+moff(mlen).
ENDIF.
"The cut group takes all a characters, so nothing is left for the final a
FIND REGEX '(?>a+|a)a' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
WRITE:/ s_text.
IF sy-subrc <> 0.
  WRITE:/ 'Nothing found'.
ENDIF.
1.4 Replace Patterns
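Patterns for the replacement text of REPLACE.
Example:
"4. Replace Patterns
"Replacing text with the REPLACE statement
"Special characters in the replacement text: $ & ` '
"Example 1: Addressing the Full Occurrence
"$0 and $& both stand for the whole match; the result is `Yeah,Yeah!`
text = `Yeah!`.
REPLACE REGEX `\w+` IN text WITH `$0,$&`.
WRITE:/ text.

"Example 2: Addressing the Registers of Subgroups
"Reorders the registered groups; the result is `CBA'n'ABC`
text = `ABC'n'CBA`.
REPLACE REGEX `(\w+)(\W\w\W)(\w+)` IN text WITH `$3$2$1`.
WRITE:/ text.

"Example 3: Addressing the Text Before the Occurrence
"$` stands for the text before the match ('ABC '), which is inserted after 'and'
text = `ABC and BCD`.
REPLACE REGEX 'and' IN text WITH '$0 $`'.
WRITE:/ text.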
1.5 Simplified Regular Expressions
Simplified regular expressions.
Example:
"5. Simplified Regular Expressions
"The simplified syntax is supported only by the class CL_ABAP_REGEX (parameter simple_regex)
"It does not support + | (?=) (?!) (?:)
"{ } must be written as \{ \}
"( ) must be written as \( \)
"Example 1
DATA: lo_regex TYPE REF TO cl_abap_regex.
DATA: t_res    TYPE match_result_tab.
DATA: wa_res   TYPE match_result.
"Without simplified syntax, + means the preceding character occurs {1,} times
CREATE OBJECT lo_regex
  EXPORTING
    pattern      = 'a+'
    ignore_case  = abap_true   "ignore case
    simple_regex = abap_false.
FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
LOOP AT t_res INTO wa_res.
  WRITE:/ wa_res-line, wa_res-offset, wa_res-length.
ENDLOOP.
"With simplified syntax, + is a literal +
CREATE OBJECT lo_regex
  EXPORTING
    pattern      = 'a+'
    simple_regex = abap_true.
FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
LOOP AT t_res INTO wa_res.
  WRITE:/ wa_res-line, wa_res-offset, wa_res-length.
ENDLOOP.
1.6 Special Characters in Regular Expressions
Overview:
"6. Special Characters in Regular Expressions
"Special characters used in search and replacement patterns
"\              Escape character for special characters
"$0, $&         Placeholder for the whole found location
"$1, $2, $3...  Placeholder for the registration of subgroups
"$`             Placeholder for the text before the found location
"$'             Placeholder for the text after the found location
2. Using regular expressions
2.1 The FIND and REPLACE statements
Example:
"Using the FIND and REPLACE statements
"FIND
"Syntax: FIND [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
"          IN [section_of] dobj
"          [IN {CHARACTER|BYTE} MODE]
"          [find_options].
"pattern = {[SUBSTRING] substring} | {REGEX regex}
"  searches for a substring or for a regex match
"section_of = SECTION [OFFSET off] [LENGTH len] OF
"  restricts the search to a section of dobj: off is where the section starts, len is its length
"find_options = [{RESPECTING|IGNORING} CASE]
"               [MATCH COUNT mcnt]
"               { {[MATCH OFFSET moff]
"                  [MATCH LENGTH mlen]}
"               | [RESULTS result_tab|result_wa] }
"               [SUBMATCHES s1 s2 ...]
"mcnt: number of matches; with FIRST OCCURRENCE it is always 1
"moff: offset of the last match; with FIRST OCCURRENCE, the offset of the first match
"mlen: length of the last match; with FIRST OCCURRENCE, the length of the first match
"SUBMATCHES: the strings matched by the registered groups
"Example 1
DATA: s1 TYPE string.
DATA: s2 TYPE string.
text = `Hey hey, my my, Rock and roll can never die`.
FIND REGEX `(\w+)\W+\1\W+(\w+)\W+\2` IN text
  IGNORING CASE
  MATCH OFFSET moff
  MATCH LENGTH mlen
  SUBMATCHES s1 s2.
WRITE:/ moff, mlen, s1, s2.

"REPLACE
"Syntax:
"1. REPLACE [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
"     IN [section_of] dobj WITH new
"     [IN {CHARACTER|BYTE} MODE]
"     [replace_options].
"replace_options = [{RESPECTING|IGNORING} CASE]
"                  [REPLACEMENT COUNT rcnt]
"                  { {[REPLACEMENT OFFSET roff] [REPLACEMENT LENGTH rlen]}
"                  | [RESULTS result_tab|result_wa] }
"2. REPLACE SECTION [OFFSET off] [LENGTH len] OF dobj WITH new
"     [IN {CHARACTER|BYTE} MODE].
"Replace all digits within the first 10 characters
text = 'hello1 world!22'.
REPLACE ALL OCCURRENCES OF REGEX '[0-9]' IN SECTION OFFSET 0 LENGTH 10 OF text WITH '!'.
WRITE:/ text.
"Replace a section given by offset and length
REPLACE SECTION OFFSET 10 LENGTH 5 OF text WITH '!'.
WRITE:/ text.
2.2 Using functions
Built-in functions that accept regular expressions include find, count, match, and others.
Example:
"Using functions
"find
"Returns the offset of the match
"Syntax:
"1. find( val = text {sub = substring}|{regex = regex} [case = case] [off = off] [len = len] [occ = occ] )
"2. find_end( val = text regex = regex [case = case] [off = off] [len = len] [occ = occ] )
"3. find_any_of( val = text sub = substring [off = off] [len = len] [occ = occ] )
"4. find_any_not_of( val = text sub = substring [off = off] [len = len] [occ = occ] )
"occ selects which occurrence is returned: positive values count from the left, negative values from the right
"Example
DATA: mocc   TYPE i VALUE 1.
DATA: result TYPE i.
text = 'hello world world'.
"Search the whole string
moff = 0.
mlen = strlen( text ).
result = find( val = text sub = 'wo' case = abap_true off = moff len = mlen occ = mocc ).
WRITE:/ text, result, moff, mlen, mocc.

"count
"Returns the number of matches
"Syntax:
"1. count( val = text {sub = substring}|{regex = regex} [case = case] [off = off] [len = len] )
"2. count_any_of( val = text sub = substring [off = off] [len = len] )
"3. count_any_not_of( val = text sub = substring [off = off] [len = len] )
result = count( val = text sub = 'wo' case = abap_true off = moff len = mlen ).
WRITE:/ text, result, moff, mlen.

"match
"Returns the matched substring
"Syntax:
"match( val = text regex = regex [case = case] [occ = occ] )
DATA: s_result TYPE string.
s_result = match( val = text regex = 'wor' case = abap_true occ = 1 ).
WRITE:/ s_result.

"contains
"Returns whether the string contains the substring (boolean)
"1. contains( val = text sub|start|end = substring [case = case] [off = off] [len = len] [occ = occ] )
"2. contains( val = text regex = regex [case = case] [off = off] [len = len] [occ = occ] )
"3. contains_any_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
"4. contains_any_not_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
"off: offset at which the search starts
"len: length of the search range, counted from off
"occ: required number of occurrences; returns false if the substring occurs fewer times
"case: case-sensitive search
text = 'abcdef egg help'.
IF contains( val = text sub = 'e' case = abap_true off = 0 len = 15 occ = 2 ).
  WRITE:/ 'contains: match found'.
ENDIF.

"matches
"Returns whether the string matches the regex (boolean)
"Syntax: matches( val = text regex = regex [case = case] [off = off] [len = len] ) ...
"Example:
text = '33340@334.com'.
"Check an e-mail address
IF matches( val = text regex = `\w+(\.\w+)*@(\w+\.)+((\l|\u){2,4})` ).
  MESSAGE 'Format OK' TYPE 'S'.
ELSEIF matches( val   = text
                regex = `[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+` &
                        `(\.[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+)*` &
                        `@[[:alnum:]-]+(\.[[:alnum:]-]+)*` &
                        `\.([[:alpha:]]{2,})` ).
  MESSAGE 'Syntax OK but unusual' TYPE 'S' DISPLAY LIKE 'W'.
ELSE.
  MESSAGE 'Wrong Format' TYPE 'S' DISPLAY LIKE 'E'.
ENDIF.

"replace
"Replaces the section given by off and len
"1. replace( val = text [off = off] [len = len] with = new )
"If off is set and len = 0, new is inserted at off;
"if len is set and off = 0, the first len characters are replaced;
"if off equals the string length and len = 0, new is appended to the string.
"Replaces a matched substring
"2. replace( val = text {sub = substring}|{regex = regex} with = new [case = case] [occ = occ] )
"occ selects which occurrence is replaced; occ = 0 replaces all occurrences
"Example:
text = 'hello world! welcome china!'.
text = replace( val = text off = 0 len = 5 with = 'hi' ).
WRITE:/ 'replace:', text.
"Only the first '!' is replaced here
text = replace( val = text sub = '!' with = '.' case = abap_true occ = 1 ).
WRITE:/ 'replace:', text.

"substring
"Returns a substring
"1. substring( val = text [off = off] [len = len] )
"2. substring_from( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"3. substring_after( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"4. substring_before( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"5. substring_to( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
text = 'ABCDEFGHJKLMN'.
text = substring( val = text off = 0 len = 10 ).
WRITE:/ 'substring:', text.
"Returns ABCDE: the text starting at the match, limited to len characters
text = 'ABCDEFGHJKLMN'.
text = substring_from( val = text sub = 'ABCDEF' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.
"Returns DEFGH: the len characters after the match
text = 'ABCDEFGHJKLMN'.
text = substring_after( val = text sub = 'ABC' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.
"Returns DEFGH: the len characters before the match
text = 'ABCDEFGHJKLMN'.
text = substring_before( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.
"Returns GHJKL: the len characters up to and including the match
text = 'ABCDEFGHJKLMN'.
text = substring_to( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.
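The examples above mostly pass sub; the same functions accept regex as well. As a small sketch under assumed names (the report name, the variables, and the sample string are invented), count, match, and replace can be applied to one pattern:

```abap
REPORT z_regex_functions. " hypothetical report name

DATA: lv_text   TYPE string VALUE `a1b22c333`, " invented sample value
      lv_count  TYPE i,
      lv_second TYPE string,
      lv_masked TYPE string.

" Number of digit runs in the string.
lv_count  = count( val = lv_text regex = '\d+' ).

" The second digit run (occ = 2 counts from the left).
lv_second = match( val = lv_text regex = '\d+' occ = 2 ).

" occ = 0 replaces every occurrence, not just the first one.
lv_masked = replace( val = lv_text regex = '\d+' with = '#' occ = 0 ).

WRITE: / lv_count, lv_second, lv_masked. " 3, 22, a#b#c#
```

Because these are functions rather than statements, the results can be used directly in expressions, for example as the operand of an IF or as an argument of another function.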
2.3 Using cl_abap_regex and cl_abap_matcher
The class cl_abap_regex creates a regular expression object; cl_abap_matcher performs matching, searching, replacing, and similar operations.
Example:
"Using the classes
"CL_ABAP_REGEX
"CL_ABAP_MATCHER
DATA: lo_matcher TYPE REF TO cl_abap_matcher.
DATA: ls_match   TYPE match_result.
DATA: lv_match   TYPE c LENGTH 1.
"Call the static method matches of cl_abap_matcher directly
IF cl_abap_matcher=>matches( pattern = 'ABC.*' text = 'ABCDABCE' ) = abap_true.
  "Get the matcher instance created by the static call
  lo_matcher = cl_abap_matcher=>get_object( ).
  "Get the match result
  ls_match = lo_matcher->get_match( ).
  "Attributes of cl_abap_matcher:
  "text:  the string that was searched
  "table: the table that was searched
  "regex: the regular expression used
  WRITE:/ 'cl_abap_matcher:', lo_matcher->text, ls_match-offset, ls_match-length.
ENDIF.

"Create a matcher object, then match
lo_matcher = cl_abap_matcher=>create( pattern     = 'A.*'
                                      ignore_case = abap_true
                                      text        = 'ABC' ).
"match( ) returns 'X' if the text matches and blank otherwise
lv_match = lo_matcher->match( ).
WRITE:/ 'cl_abap_matcher:', lv_match.

"Create a cl_abap_regex object, then create the matcher from it
"DATA: lo_regex TYPE REF TO cl_abap_regex.
CREATE OBJECT lo_regex
  EXPORTING
    pattern     = '^add.*'
    ignore_case = abap_true.
lo_matcher = lo_regex->create_matcher( text = 'addition' ).
lv_match = lo_matcher->match( ).
WRITE:/ 'cl_abap_matcher:', lv_match.

"Create the matcher object with its constructor
DATA: t_result_tab TYPE match_result_tab.
DATA: s_result_tab TYPE match_result.
CREATE OBJECT lo_regex
  EXPORTING
    pattern = 'A'.
CREATE OBJECT lo_matcher
  EXPORTING
    regex = lo_regex
    text  = 'ABCDABCD'.
t_result_tab = lo_matcher->find_all( ).
LOOP AT t_result_tab INTO s_result_tab.
  WRITE:/ 'find_all:', s_result_tab-offset, s_result_tab-length.
ENDLOOP.
This article comes from cnblogs (博客园), author: 渔歌晚唱. Please cite the original link when reposting: https://www.cnblogs.com/tangToms/p/13828375.html