Python - Regex with match (2): Greedy and Non-Greedy Modes
import re

content02 = 'Hello 1234567 World_This is a Regex Demo'

# Greedy mode: .* takes as many characters as it can
result03 = re.match(r'^Hello.*(\d+).*Demo', content02)
print(result03)
print(result03.group())
print(result03.group(1))

# Non-greedy mode: .*? takes as few characters as it can
result04 = re.match(r'^Hello.*?(\d+).*Demo', content02)
print(result04)
print(result04.group())
print(result04.group(1))
Output:
<re.Match object; span=(0, 41), match='Hello 1234567 World_This is a Regex Demo'>
Hello 1234567 World_This is a Regex Demo
7
<re.Match object; span=(0, 41), match='Hello 1234567 World_This is a Regex Demo'>
Hello 1234567 World_This is a Regex Demo
1234567
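Why the two patterns capture different digits: the greedy .* first consumes the rest of the string and then backtracks only as far as needed for (\d+) to match a single digit, so the group keeps just the last digit. The non-greedy .*? stops as early as possible, so (\d+) gets the whole run of digits. The following is a minimal sketch that prints the span of group 1 to make this visible; the variable names text, greedy and lazy are illustrative and not from the original code.

```python
import re

text = 'Hello 1234567 World_This is a Regex Demo'

# Greedy: .* swallows ' 123456' and leaves only the final digit for (\d+)
greedy = re.match(r'^Hello.*(\d+).*Demo', text)
# Non-greedy: .*? swallows only the space, so (\d+) captures every digit
lazy = re.match(r'^Hello.*?(\d+).*Demo', text)

print(greedy.group(1), greedy.span(1))  # 7 (12, 13)
print(lazy.group(1), lazy.span(1))      # 1234567 (6, 13)
```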
Note, however, that if the part you want to capture sits at the end of the string, a non-greedy .*? may match nothing at all, as shown below:
content05 = 'http://weibo.com/comment/kEraCN'

# Non-greedy group at the end of the pattern
result05_1 = re.match('^http.*?comment/(.*?)', content05)
# Greedy group at the end of the pattern
result05_2 = re.match('^http.*?comment/(.*)', content05)

print('result05_1:', result05_1.group(1))
print('result05_2:', result05_2.group(1))
Output:
result05_1:
result05_2: kEraCN
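The empty result comes from the fact that nothing follows the final (.*?) in the pattern, so the shortest possible match, the empty string, already satisfies it. One way to keep the group non-greedy and still capture the tail is to anchor the pattern with $, which forces the group to extend to the end of the string. This is a sketch under that assumption, not part of the original example; the names url, lazy, lazy_anchored and greedy are illustrative.

```python
import re

url = 'http://weibo.com/comment/kEraCN'

# Non-greedy group with nothing after it: the empty string is enough
lazy = re.match(r'^http.*?comment/(.*?)', url)
# Non-greedy group followed by $: it must grow until the end of the string
lazy_anchored = re.match(r'^http.*?comment/(.*?)$', url)
# Greedy group: takes everything that remains
greedy = re.match(r'^http.*?comment/(.*)', url)

print(repr(lazy.group(1)))           # ''
print(repr(lazy_anchored.group(1)))  # 'kEraCN'
print(repr(greedy.group(1)))         # 'kEraCN'
```

When the target is at the very end of the string, the greedy (.*) is usually the simpler choice; the anchored form is mainly useful when the same pattern must stay non-greedy for other reasons.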