The re.compile, re.match and re.search functions of the Python 3 re module

Reference: https://www.jb51.net/article/141830.htm

Official documentation: https://docs.python.org/3/library/re.html

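re.compile() function
Compiles a regular expression pattern and returns a pattern object. Compiling a frequently used expression once makes it convenient to reuse and avoids compiling it again on every call.
re.compile(pattern, flags=0)
    pattern: the regular expression string to compile
    flags: compilation flags that modify how the expression matches. Several flags can be combined with bitwise OR, for example re.I | re.M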
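As a minimal sketch of the reuse described above, the following compiles one pattern and applies it to several strings. The names word_pattern and lines are illustrative only and do not come from the original post.

```python
import re

# Compile once; the pattern object can then be reused for many calls.
word_pattern = re.compile(r"[a-z]+", flags=re.I)

lines = ["Hello World", "foo BAR baz"]
words_per_line = [word_pattern.findall(line) for line in lines]
print(words_per_line)

# A pattern object offers the same operations as the module-level functions.
first = word_pattern.match("Hello World")
print(first.group(0))
```

The pattern object's methods (match, search, findall, sub) take the string directly, since the pattern and flags were fixed at compile time.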
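flags parameter values
re.I (re.IGNORECASE)
Makes matching case-insensitive.
re.L (re.LOCALE)
Makes matching locale-aware. In Python 3 it can only be used with bytes patterns.
re.M (re.MULTILINE)
Multi-line matching: ^ and $ also match at the start and end of each line.
re.S (re.DOTALL)
Makes . match every character, including newline.
re.U (re.UNICODE)
Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B. In Python 3 this is already the default for str patterns.
re.X (re.VERBOSE)
Ignores whitespace and comments inside the pattern, so the regular expression can be laid out in a more readable way.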
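The effect of the most common flags can be sketched as follows. The sample text and variable names are made up for illustration.

```python
import re

text = "first line\nSecond line"

# re.I: case-insensitive matching
ignore_case = re.findall(r"second", text, re.I)

# re.M: ^ and $ also match at the start and end of each line
line_starts = re.findall(r"^\w+", text, re.M)

# re.S: . also matches the newline character
no_dotall = re.search(r"first.*Second", text)
with_dotall = re.search(r"first.*Second", text, re.S)

# re.X: whitespace and comments inside the pattern are ignored
date_pattern = re.compile(r"""
    (\d{4})  # year
    -
    (\d{2})  # month
""", re.X)
year_month = date_pattern.match("2019-11").groups()

print(ignore_case, line_starts, no_dotall, with_dotall is not None, year_month)
```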

In Python 3.7 the flags are members of the RegexFlag enum (re.S is the same object as re.RegexFlag.S): pattern = re.compile(r'<p\sclass="s2">(.*?)</p>', re.RegexFlag.S)
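A small sketch of that pattern in use, assuming a hypothetical HTML fragment in which one paragraph spans two lines. Without re.S the second paragraph would not match, because . would stop at the newline.

```python
import re

# Hypothetical HTML fragment; the second paragraph spans two lines.
html = '<p class="s2">one</p>\n<p class="s2">two\nlines</p>'

pattern = re.compile(r'<p\sclass="s2">(.*?)</p>', re.S)
paragraphs = pattern.findall(html)
print(paragraphs)

# The short flag name and the enum member refer to the same object.
same_flag = re.S is re.RegexFlag.S
print(same_flag)
```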

__version__ = "2.2.1"

class RegexFlag(enum.IntFlag):
    ASCII = sre_compile.SRE_FLAG_ASCII # assume ascii "locale"
    IGNORECASE = sre_compile.SRE_FLAG_IGNORECASE # ignore case
    LOCALE = sre_compile.SRE_FLAG_LOCALE # assume current 8-bit locale
    UNICODE = sre_compile.SRE_FLAG_UNICODE # assume unicode "locale"
    MULTILINE = sre_compile.SRE_FLAG_MULTILINE # make anchors look for newline
    DOTALL = sre_compile.SRE_FLAG_DOTALL # make dot match newline
    VERBOSE = sre_compile.SRE_FLAG_VERBOSE # ignore whitespace and comments
    A = ASCII
    I = IGNORECASE
    L = LOCALE
    U = UNICODE
    M = MULTILINE
    S = DOTALL
    X = VERBOSE
    # sre extensions (experimental, don't rely on these)
    TEMPLATE = sre_compile.SRE_FLAG_TEMPLATE # disable backtracking
    T = TEMPLATE
    DEBUG = sre_compile.SRE_FLAG_DEBUG # dump pattern after compilation
globals().update(RegexFlag.__members__)

# sre exception
error = sre_compile.error

Find all matches:

import re
content = 'Citizen wang , always fall in love with neighbour，WANG'
rr = re.compile(r'wan\w', re.I)  # case-insensitive
print(type(rr))  # <class 're.Pattern'> on Python 3.7+
a = rr.findall(content)
print(type(a))  # <class 'list'>
print(a)  # ['wang', 'WANG']

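Check whether a string contains a pattern:

import re
str_content = "Python is a good language， sea say this is a game "  # text to match; the string argument of re.match
# re.match anchors at the start of the string, so each alternative needs a leading .*
# Neither "good1" nor "game1" occurs in the text, so nothing is printed here;
# with "good" or "game" in the pattern the branch below would run.
re_content = re.match(".*good1|.*game1", str_content, re.I)
if re_content:
    print("hahahaha")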
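For a plain containment check, re.search is usually the simpler tool, since it scans the whole string while re.match only tries the beginning. A minimal sketch with a shortened sample string (not the one used above):

```python
import re

text = "Python is a good language"

# re.match only tries to match at the beginning of the string.
at_start = re.match(r"good", text)

# re.search scans the whole string and returns the first match.
anywhere = re.search(r"good", text)
contains_good = anywhere is not None

print(at_start, anywhere.span(), contains_good)
```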

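Regex search and replace

import re

url = "http://sea.com?newest=1&size=10"

# Search: extract the value of the newest parameter
newest = int(re.search(r"newest=(\d+)", url, re.M | re.I).group(1))
# Replace: write the value back into the URL
new_url = re.sub(r"newest=\d+", "newest=" + str(newest), url)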
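re.sub also accepts a function as the replacement, which allows searching and replacing in a single step. The sketch below assumes the goal is to advance the newest value by one (the original writes the same value back); bump_newest is a hypothetical helper name.

```python
import re

url = "http://sea.com?newest=1&size=10"

def bump_newest(match):
    # Hypothetical helper: group(1) is the captured number; return the replacement text.
    return "newest=" + str(int(match.group(1)) + 1)

next_url = re.sub(r"newest=(\d+)", bump_newest, url)
print(next_url)
```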

posted on 2019-11-19 16:30 lshan