Using Regular Expressions in Python (Part 2)

This part looks at the re.compile() function and the group() and groups() methods of match objects.

1. re.compile()

The compile function compiles a regular expression into a regular expression (Pattern) object. The compiled object can then be used to match strings.
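As a minimal sketch of what this means in practice, a compiled pattern can be reused across many strings and gives the same result as the module-level re.match() called with the pattern string. The sample strings and variable names below are made up for illustration.

```python
import re

# Compile once, then reuse the same object for several inputs.
two_digits = re.compile(r'\d{2}')
samples = ['12bc34', 'ab12', '7x']  # illustrative inputs

# match() only succeeds when the pattern matches at the start of the string.
compiled_results = [bool(two_digits.match(s)) for s in samples]

# The module-level function takes the pattern string directly.
direct_results = [bool(re.match(r'\d{2}', s)) for s in samples]

print(compiled_results)  # [True, False, False]
print(compiled_results == direct_results)  # True
```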

Its signature is:
>>> import re
>>> help(re.compile)
Help on function compile in module re:

compile(pattern, flags=0)
    Compile a regular expression pattern, returning a pattern object.
>>>

pattern: a regular expression given as a string
flags: optional; sets the matching mode, such as ignoring case or multi-line mode
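To make the flags parameter concrete, here is a small sketch that combines re.IGNORECASE and re.MULTILINE with the | operator. The sample text is an assumption chosen only to show the difference.

```python
import re

text = 'Apple\nalone\nATTACK'  # illustrative three-line string

# Without flags, 'a' is case-sensitive and ^ matches only at the start of the string.
plain = re.compile(r'^a\w+')
print(plain.findall(text))    # []

# re.IGNORECASE ignores case; re.MULTILINE lets ^ match at the start of every line.
flagged = re.compile(r'^a\w+', re.IGNORECASE | re.MULTILINE)
print(flagged.findall(text))  # ['Apple', 'alone', 'ATTACK']
```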

Example:

>>> test_pattern = re.compile(r'\d{2}')   # compile a regular expression and assign it to a variable
>>> m = test_pattern.match('12bc34')   # match a string directly with the compiled pattern object
>>> m
<_sre.SRE_Match object; span=(0, 2), match='12'>

>>> test_pattern = re.compile(r'a\w+')  # build a pattern object: an 'a' followed by one or more word characters
>>> m = test_pattern.findall('apple,blue,alone,shot,attack') # findall() returns every non-overlapping substring that matches
>>> m
['apple', 'alone', 'attack']
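Note that a\w+ is not anchored to the start of a word. It works on the list above only because none of the other words contain an 'a'. The following sketch, using a made-up word list, shows the difference a \b word boundary makes.

```python
import re

words = 'apple,banana,alone'  # illustrative input

# Without a boundary, the pattern also matches inside 'banana'.
loose = re.compile(r'a\w+')
print(loose.findall(words))   # ['apple', 'anana', 'alone']

# \b requires a word boundary before the 'a', so only whole words starting with 'a' match.
strict = re.compile(r'\ba\w+')
print(strict.findall(words))  # ['apple', 'alone']
```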

2. group() and groups()

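After matching with match() or search(), call group() on the resulting match object to get the matched text. group() can also extract the substrings captured by groups (parentheses () in a regular expression define a group).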

Example:

>>> pattern = re.compile(r'^(\d{3})-(\d{3,8})$')  # three digits, then a hyphen, then 3 to 8 digits
>>> m = pattern.match('020-1234567')
>>> m
<_sre.SRE_Match object; span=(0, 11), match='020-1234567'>
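>>> m.group()   # the whole matched string
'020-1234567'
>>> m.group(0)  # also the whole matched string
'020-1234567'
>>> m.group(1)   # the substring captured by group 1
'020'
>>> m.group(2)   # the substring captured by group 2
'1234567'
>>> m.group(3)   # there is no group 3, so this raises IndexError: no such group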
Traceback (most recent call last):
  File "<pyshell#73>", line 1, in <module>
    m.group(3)
IndexError: no such group

>>> m.groups()   # all captured groups as a tuple
('020', '1234567')
>>>
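One practical point the session above does not show: match() returns None when the string does not match, so calling group() on the result without checking raises AttributeError. A minimal sketch, where split_phone is a hypothetical helper name introduced only for this example:

```python
import re

pattern = re.compile(r'^(\d{3})-(\d{3,8})$')

def split_phone(text):
    """Return (area_code, number), or None when text does not match."""
    m = pattern.match(text)
    if m is None:          # match() returns None on failure
        return None
    return m.group(1, 2)   # passing several indexes returns a tuple

print(split_phone('020-1234567'))  # ('020', '1234567')
print(split_phone('020 1234567'))  # None
```

Passing several indexes to group() is equivalent here to groups(), since the pattern has exactly two groups.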

2018-06-07 22:43:46

posted @ 2018-06-07 22:40 我是冰霜
