Python Regular Expressions

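Quick summary: match() matches only at the start (test with if, read with group(), can ignore case); search() matches anywhere (test with if, group(0) is the whole match); findall() returns a list; + * ? are quantifiers; re.sub() replaces.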

I. The re module

  Reference: https://www.ibm.com/developerworks/cn/opensource/os-cn-pythonre/index.html

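  \s: matches any whitespace character (space, tab, newline, and so on)

  +: the element immediately before it occurs one or more times

  *: the element immediately before it occurs any number of times, including zero

  ?: the element immediately before it occurs zero or one time

  .: matches any single character except \n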
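  To make the list above concrete, here is a minimal sketch that runs each metacharacter through re.findall(). The sample strings are made up for illustration and are not from the reference article.

```python
import re

# \s: any whitespace character (here a space and a tab)
whitespace = re.findall(r'\s', 'a b\tc')

# +: 'b' must appear one or more times after 'a'
plus = re.findall(r'ab+', 'a ab abb')

# *: 'b' may appear any number of times, including zero
star = re.findall(r'ab*', 'a ab abb')

# ?: 'b' may appear zero or one time
optional = re.findall(r'ab?', 'a ab abb')

# .: any single character except \n, so 'a\nc' is skipped
dot = re.findall(r'a.c', 'abc a\nc a-c')

print(whitespace)  # [' ', '\t']
print(plus)        # ['ab', 'abb']
print(star)        # ['a', 'ab', 'abb']
print(optional)    # ['a', 'ab', 'ab']
print(dot)         # ['abc', 'a-c']
```

  The patterns are written as raw strings (r'...') so that backslashes such as \s reach the regex engine unchanged.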

II. Examples

  1) re.match(): matches only at the start of the string; test the result with if, read it with group(), optionally ignore case

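import re
s = 'what are you doing are?'
if re.match('are', s):
    print("touched")
# No match: re.match() only looks at the start of the string, and s does not start with 'are'

if re.match('what', s):
    print("touched")
# Matches, because s starts with 'what'

# group()
a = re.match('what', s)
a.group()
# 'what': the start of s matched 'what', so a.group() returns 'what'

# If the start does not match, re.match() returns None, and calling a.group() raises AttributeError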

  Case can also be ignored:

import re
if re.match('gene', "Gene_name", re.IGNORECASE):
    print('yes')
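  Since re.match() returns None when the start of the string does not match, it is safer to check the result before calling group(). The sketch below wraps that check in a helper; the name leading_word is hypothetical and used only for illustration.

```python
import re

def leading_word(pattern, text):
    """Hypothetical helper: return the text matched at the start, or None."""
    m = re.match(pattern, text, re.IGNORECASE)
    # Guard against None before calling group()
    return m.group() if m is not None else None

hit = leading_word('gene', 'Gene_name')
miss = leading_word('name', 'Gene_name')

print(hit)   # Gene
print(miss)  # None
```

  Note that group() returns the text as it appears in the input ('Gene'), not as it is written in the pattern ('gene').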

  2) re.search(): matches anywhere in the string; test the result with if, read it with group()

import re
s = 'what are you doing are?'
if re.search('are', s):
    print("touched")
# search() scans the whole string, so it can match text in the middle

  Getting the matched text:

# To get the matched text, use group()
target = '<strong>115</strong>'
re.search('(>)([0-9]*)(<)', target).group(1)
# '>'
re.search('(>)([0-9]*)(<)', target).group(2)
# '115'
re.search('(>)([0-9]*)(<)', target).group(3)
# '<'
re.search('(>)([0-9]*)(<)', target).group(0)
# '>115<'
'''
1. group() is the same as group(0): the text matched by the whole regular expression
2. group(1) returns the part matched by the first pair of parentheses, group(2) the second, group(3) the third
'''
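  The example above runs the same search four times. As a possible alternative, the match object can be stored once and queried repeatedly; groups() returns all parenthesized parts as a tuple. This is a sketch reusing the same target string, and the variable names are illustrative.

```python
import re

target = '<strong>115</strong>'
m = re.search(r'(>)([0-9]*)(<)', target)

if m:  # search() returns None when nothing matches
    whole = m.group(0)        # text matched by the whole pattern
    parts = m.groups()        # every parenthesized group, in order
    number = int(m.group(2))  # the digits, converted to an int

    print(whole)   # >115<
    print(parts)   # ('>', '115', '<')
    print(number)  # 115
```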

  3) re.findall(): returns a list of all matches

import re
content = 'Hello 123456789 Word_This is just a test 666 Test'
results = re.findall(r'\d+', content)  # raw string, so \d is passed to the regex engine as-is
print(results)
# ['123456789', '666']
# If nothing matches, an empty list [] is returned
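  One behaviour worth knowing: when the pattern contains parentheses, re.findall() returns the groups rather than the whole match. The sketch below uses the same content string; the extra patterns are made up for illustration.

```python
import re

content = 'Hello 123456789 Word_This is just a test 666 Test'

# No groups: each item is the whole match
no_group = re.findall(r'\d+', content)

# One group: each item is only the text captured by that group
one_group = re.findall(r'(\d)\d*', content)

# Several groups: each item is a tuple with one entry per group
two_groups = re.findall(r'(\d)(\d*)', content)

# No match at all: an empty list
nothing = re.findall(r'xyz', content)

print(no_group)    # ['123456789', '666']
print(one_group)   # ['1', '6']
print(two_groups)  # [('1', '23456789'), ('6', '66')]
print(nothing)     # []
```

  If parentheses are needed only for grouping and not for capturing, a non-capturing group (?:...) keeps findall() returning whole matches.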

  4) re.sub(): string replacement

import re
line = '帮同事订的，以下是他给我反馈：It was excellent and I can recommend it to anyone else to stay there'
re.sub('[a-zA-Z0-9]', '', line)
# Returns '帮同事订的，以下是他给我反馈：' followed by the 13 spaces that separated the English words

import re
s = 'movies of the summer.<br /><br />Robin Williams'

re.sub('<[^>]*>', '', s)
#'movies of the summer.Robin Williams'
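  re.sub() also accepts a count argument that limits how many replacements are made, and re.subn() reports how many were made. A brief sketch on the same string follows; the variable names are illustrative.

```python
import re

s = 'movies of the summer.<br /><br />Robin Williams'
tag = re.compile(r'<[^>]*>')  # '<', then anything that is not '>', then '>'

# Replace only the first tag, here with a space
first_only = tag.sub(' ', s, count=1)

# subn() returns a (new_string, number_of_replacements) tuple
cleaned, n = tag.subn('', s)

print(first_only)  # movies of the summer. <br />Robin Williams
print(cleaned)     # movies of the summer.Robin Williams
print(n)           # 2
```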

  5) re.finditer(): get the positions (start and end indexes) of every keyword match

import re
s = '奥迪你把奥迪厉害宝马奥迪奥'
pattern = re.compile('奥迪|宝马')
result = pattern.finditer(s)
for each in result:
    print(each.span(), each.group())
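  If the positions are needed later rather than just printed, they can be collected per keyword. The sketch below is one way to do that, reusing the string and pattern from the example; the positions dictionary is an illustrative name, not part of the re module.

```python
import re
from collections import defaultdict

s = '奥迪你把奥迪厉害宝马奥迪奥'
pattern = re.compile('奥迪|宝马')

# Map each keyword to the list of (start, end) spans where it occurs
positions = defaultdict(list)
for m in pattern.finditer(s):
    positions[m.group()].append(m.span())

print(dict(positions))
# {'奥迪': [(0, 2), (4, 6), (10, 12)], '宝马': [(8, 10)]}
```

  Each span is a half-open interval, so s[start:end] gives back the matched keyword.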

Posted 2019-03-25 11:37 by 1直在路上1
