Python regular expressions: extensions of match, search, and findall

 

Printing the matched content

>>> import re

>>> re.match(r"\d","123").group()

'1'

>>> re.match(r"\D","a123").group()

'a'

>>> re.match(r"\D+","a123").group()

'a'

>>> re.match(r"\D+","abc123").group()

'abc'
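
The three functions named in the title differ mainly in where they look for a match. The following is a minimal sketch, assuming a made-up sample string `text` that is not part of the original post:

```python
import re

text = "ab12cd345"  # hypothetical sample input

# re.match only tries position 0, so the leading letters make it fail.
at_start = re.match(r"\d+", text)

# re.search scans forward and returns the first match it finds.
first = re.search(r"\d+", text)

# re.findall returns every non-overlapping match as a list of strings.
all_runs = re.findall(r"\d+", text)

print(at_start)        # None
print(first.group())   # 12
print(all_runs)        # ['12', '345']
```

Because `re.match` returns None here, calling `.group()` on it directly would raise AttributeError.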

 

Matching whitespace / non-whitespace

\s (lowercase): matches whitespace

\S (uppercase): matches non-whitespace

>>> re.search(r"\s","ab cd")  # matches the space
<_sre.SRE_Match object; span=(2, 3), match=' '>


>>> re.search(r"\s+","ab\t    \r\ncd")
<_sre.SRE_Match object; span=(2, 9), match='\t    \r\n'>

 

>>> re.findall(r"\S+","ab cd\t ef\nhi")
['ab', 'cd', 'ef', 'hi']


>>> "".join(re.findall(r"\S+","ab cd\t ef\nhi"))

'abcdefhi'
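
As a complement to collecting the non-whitespace runs, the whitespace runs themselves can be used as separators. This is only a sketch of one possible approach; the variable names are illustrative:

```python
import re

raw = "ab cd\t ef\nhi"

# Split on runs of whitespace; this yields the same words that \S+ finds.
words = re.split(r"\s+", raw)

# Replace every run of whitespace with a single space.
normalized = re.sub(r"\s+", " ", raw)

print(words)       # ['ab', 'cd', 'ef', 'hi']
print(normalized)  # ab cd ef hi
```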

 

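\w: matches word characters (uppercase and lowercase letters, digits, and _); \W matches any character that \w does not

>>> re.search(r"\w+","aaaZAW0123_")  # matches letters of either case, digits, and _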
<_sre.SRE_Match object; span=(0, 11), match='aaaZAW0123_'>


>>> re.search(r"\w+","aaaZAW0123_").group()
'aaaZAW0123_'

 

>>> re.search(r"\W+","aaaZAW0123_-").group()
'-'
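
One detail worth knowing: for str patterns in Python 3, \w also matches non-ASCII letters and digits unless the re.ASCII flag is given. A small sketch, using a hypothetical sample string written with escapes so the accented characters are unambiguous:

```python
import re

sample = "na\u00efve caf\u00e9_1"  # "naïve café_1", hypothetical sample

# Default for str patterns: \w is Unicode-aware.
unicode_words = re.findall(r"\w+", sample)

# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)

print(unicode_words)  # ['naïve', 'café_1']
print(ascii_words)    # ['na', 've', 'caf', '_1']
```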

 

 Handling greediness: the ? quantifier

>>> import re

>>> re.search(r"\d?","a7").group()

''

>>> re.search(r"\d?","7").group()
'7'


>>> re.search(r"\d+","a7").group()
'7'
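
The empty result for "a7" comes from the fact that \d? is allowed to match zero digits, so the search already succeeds at index 0 before it ever reaches the 7. When ? follows another quantifier it has a different job: it makes that quantifier lazy. A short sketch with a hypothetical tag-like string:

```python
import re

tagged = "<a><b>"  # hypothetical sample

greedy = re.search(r"<.+>", tagged).group()   # + takes as much as it can
lazy = re.search(r"<.+?>", tagged).group()    # +? stops at the first ">"

# \d? may match zero digits, so the search succeeds with an empty match at index 0.
optional = re.search(r"\d?", "a7")

print(greedy)           # <a><b>
print(lazy)             # <a>
print(optional.span())  # (0, 0)
```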

 

 Repetition counts with {m,n}

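>>> re.search(r"\d{3}","123456789").group()  # matches exactly 3 digits
'123'

>>> re.search(r"\d{1,3}","123456789").group()  # matches 1 to 3 digits, greedy: takes 3
'123'

>>> re.search(r"\d{1,3}?","123456789").group()  # lazy: takes as few as allowed, here 1
'1'

>>> re.search(r"\d{0,3}?","123456789").group()  # lazy with a minimum of 0: matches the empty string
''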
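
To see how the greedy and lazy forms of {1,3} divide up the same digit string, re.findall can be applied to both. This is an illustrative sketch, not taken from the original post:

```python
import re

digits = "123456789"

# Greedy: each match takes up to 3 digits.
greedy_chunks = re.findall(r"\d{1,3}", digits)

# Lazy: each match takes the minimum of 1 digit.
lazy_chunks = re.findall(r"\d{1,3}?", digits)

print(greedy_chunks)  # ['123', '456', '789']
print(lazy_chunks)    # ['1', '2', '3', '4', '5', '6', '7', '8', '9']
```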

 

 ^: matching at the start

>>> re.search(r"^abc","dddabc")  # no match: the result is None, so nothing is printed
>>>



>>> re.search(r"\d*?","7")  # lazy *: matches zero digits, so the match is empty
<_sre.SRE_Match object; span=(0, 0), match=''>

 

>>> re.search(r"^abc","abcdddabc")
<_sre.SRE_Match object; span=(0, 3), match='abc'>

 

>>> re.search(r"^\d+","133dddabc")
<_sre.SRE_Match object; span=(0, 3), match='133'>
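
Since re.search returns None when ^ cannot match at the start, chaining .group() directly is only safe when a match is guaranteed. A minimal sketch of guarding against that, using a hypothetical helper named `leading_digits`:

```python
import re

def leading_digits(text):
    """Return the digits at the start of text, or None if there are none."""
    m = re.search(r"^\d+", text)
    # re.search returns None on failure; calling .group() on None raises AttributeError.
    return m.group() if m else None

print(leading_digits("133dddabc"))  # 133
print(leading_digits("dddabc133"))  # None
```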

 

Adding $ anchors the pattern to the end of the string, so here it matches the trailing digits

>>> re.search(r"\d+$","133dddabc5555")
<_sre.SRE_Match object; span=(9, 13), match='5555'>

 

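 "^1XXX$": anchoring both ends

# With "^123$" both ends are anchored, so the string must be exactly "123" (a single trailing newline is still tolerated by $)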

>>> re.search(r"^123$","123")
<_sre.SRE_Match object; span=(0, 3), match='123'>


>>> re.search(r"^123$","123sss")  # no match (None)


>>> re.search(r"^123$","ss123")  # no match (None)

 

# \A...\Z: equivalent to anchoring both ends, except that \Z, unlike $, does not accept a trailing newline

>>> re.search(r"\A123\Z","123")

<_sre.SRE_Match object; span=(0, 3), match='123'>
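
The two anchoring styles behave the same for "123" but differ once the string ends in a newline. The sketch below compares them and also shows re.fullmatch, which is another way to require that the whole string matches; the sample string is an assumption for illustration:

```python
import re

with_newline = "123\n"  # hypothetical input with a trailing newline

dollar = re.search(r"^123$", with_newline)    # $ also matches just before a final "\n"
strict = re.search(r"\A123\Z", with_newline)  # \Z matches only at the very end
full = re.fullmatch(r"123", with_newline)     # the whole string must match the pattern

print(dollar is not None)  # True
print(strict)              # None
print(full)                # None
```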

 

>>> re.search(r"\d(\D+)\d","1abc3").group(1)
'abc'


>>> re.search(r"(\d)(\D+)(\d)","1abc3").group(1)
'1'

 

Group matching

>>> re.search(r"(\d)(\D+)(\d)","1abc3").group(2)
'abc'
>>> re.search(r"(\d)(\D+)(\d)","1abc3").group(3)
'3' 
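
Beyond fetching one numbered group at a time, a match object can return all groups at once, and groups can be named or referred back to inside the pattern. A brief sketch; the group names `head`, `body`, and `tail` are made up for illustration:

```python
import re

m = re.search(r"(\d)(\D+)(\d)", "1abc3")

print(m.group(0))  # 1abc3  (group 0 is the whole match)
print(m.groups())  # ('1', 'abc', '3')

# Named groups: (?P<name>...) lets a group be looked up by name.
named = re.search(r"(?P<head>\d)(?P<body>\D+)(?P<tail>\d)", "1abc3")
print(named.group("body"))  # abc

# Backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"(\w)\1", "hello")
print(doubled.group())  # ll
```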

 

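Detailed rules

re.I: ignore case

re.M: treat the string as multiple lines, so that ^ matches at the start of each line and $ matches at the end of each line

re.S: extends . so that it matches every character in the string, including "\n"

?: zero or one; placed after another quantifier, it limits greediness

*: zero or more

+: one or more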
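
The three flags can be compared side by side with and without the flag applied. This is a small sketch on hypothetical sample strings:

```python
import re

lines = "ab\ncd"  # hypothetical two-line sample

# re.I: letter case is ignored.
ignore_case = re.search(r"abc", "xxABCxx", re.I).group()

# Without re.M, ^ matches only at the start of the string.
first_only = re.findall(r"^\w+", lines)
# With re.M, ^ also matches right after each newline.
every_line = re.findall(r"^\w+", lines, re.M)

# By default . does not match "\n"; with re.S it does.
dot_default = re.search(r"b.c", lines)
dot_all = re.search(r"b.c", lines, re.S).group()

print(ignore_case)    # ABC
print(first_only)     # ['ab']
print(every_line)     # ['ab', 'cd']
print(dot_default)    # None
print(repr(dot_all))  # 'b\nc'
```

Flags can be combined with |, for example re.I | re.M.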

 

posted @ 2019-04-01 00:18  翻滚的小强