Regular Expressions

The re Module

Matching and Search Methods

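`re.match(pattern, string)`: tries to match the regular expression only at the start of the string. It returns a match object if the match succeeds and `None` otherwise.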

import re
pattern = r'hello'
string = 'hello world'
match = re.match(pattern, string)
if match:
    print(match.group())  # hello

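`group()` returns the substring matched by the match object. Called with a number, as in `group(1)`, it returns the text captured by that group.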

# Example 3: capturing information with groups
text = "2024-10-01"
pattern = r'(\d{4})-(\d{2})-(\d{2})'
match = re.search(pattern, text)
if match:
    year = match.group(1)
    month = match.group(2)
    day = match.group(3)
    print(f"Year: {year}, Month: {month}, Day: {day}")
    # Output: Year: 2024, Month: 10, Day: 01
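
Besides numbered groups, a match object also offers `groups()` and, when the pattern uses named groups, `groupdict()`. The sketch below reuses the date string from the example above. The group names `year`, `month`, and `day` are chosen here purely for illustration.

```python
import re

text = "2024-10-01"
# Same date pattern as above, rewritten with named groups (names are illustrative).
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, text)

whole = match.group(0)     # the entire matched text
parts = match.groups()     # every captured group, as a tuple
named = match.groupdict()  # named groups, as a dict

print(whole)  # 2024-10-01
print(parts)  # ('2024', '10', '01')
print(named)  # {'year': '2024', 'month': '10', 'day': '01'}
```

Named groups can also be read one at a time with `match.group('year')`, which tends to be easier to follow than positional indices once a pattern has more than two or three groups.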

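`re.search()`: scans the string for the first position where the regular expression matches. It returns a match object if a match is found and `None` otherwise. Unlike `re.match()`, it looks through the whole string rather than only at the start.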

pattern = r'world'
string = 'hello world'
match = re.search(pattern, string)
if match:
    print(match.group())  # world
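
To make the difference between `re.match()` and `re.search()` concrete, here is a small sketch that applies the same pattern to the same string with both functions. The variable names `at_start` and `anywhere` are illustrative.

```python
import re

string = 'hello world'

# 'world' does not appear at the start, so re.match() fails ...
at_start = re.match(r'world', string)
# ... while re.search() finds it further into the string.
anywhere = re.search(r'world', string)

print(at_start)          # None
print(anywhere.group())  # world
print(anywhere.span())   # (6, 11)
```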

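`re.findall()`: finds every non-overlapping substring that matches the regular expression and returns them as a list. If nothing matches, it returns an empty list.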

pattern = r'\d'
string = 'hello world123456789'
match = re.findall(pattern, string)
print(match)  # ['1', '2', '3', '4', '5', '6', '7', '8', '9']
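
One detail that often surprises people: when the pattern contains capturing groups, `re.findall()` returns the group contents instead of the whole match. The sketch below shows the three cases. The input string is made up for illustration.

```python
import re

string = 'x=1, y=22, z=333'

no_group = re.findall(r'\w=\d+', string)        # no groups: whole matches
one_group = re.findall(r'\w=(\d+)', string)     # one group: that group's text
two_groups = re.findall(r'(\w)=(\d+)', string)  # several groups: tuples

print(no_group)    # ['x=1', 'y=22', 'z=333']
print(one_group)   # ['1', '22', '333']
print(two_groups)  # [('x', '1'), ('y', '22'), ('z', '333')]
```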

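`re.finditer()`: similar to `re.findall()`, but it returns an iterator whose elements are match objects. Iterating over it yields each match in turn, together with details such as its position in the string.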
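
A minimal sketch of `re.finditer()`, assuming the same kind of digit pattern used elsewhere in this post. The list `found` is only there to collect the results for inspection.

```python
import re

string = 'abc123def456ghi'

found = []
for m in re.finditer(r'\d+', string):
    # Each m is a match object, so the position is available as well as the text.
    found.append((m.group(), m.start(), m.end()))
    print(m.group(), m.span())
# 123 (3, 6)
# 456 (9, 12)
```

Because the matches are produced lazily, this can be a better fit than `re.findall()` when the input is large or when the match positions are needed.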

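`re.sub()`: replaces the substrings that match the regular expression. It takes three required arguments (the pattern, the replacement, and the string to process) and an optional `count` that limits how many replacements are made.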

pattern = r'\d+'
replacement = 'X'
string = 'abc123def456ghi'
new_string = re.sub(pattern, replacement, string)
print(new_string)  # abcXdefXghi
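
The replacement does not have to be a fixed string. As a sketch, the example below shows the optional `count` argument, a backreference (`\1`) in the replacement, and a function used as the replacement. The variable names are illustrative.

```python
import re

string = 'abc123def456ghi'

# Replace only the first run of digits.
first_only = re.sub(r'\d+', 'X', string, count=1)
# Refer back to the captured digits with \1.
wrapped = re.sub(r'(\d+)', r'<\1>', string)
# Compute each replacement from the match object.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), string)

print(first_only)  # abcXdef456ghi
print(wrapped)     # abc<123>def<456>ghi
print(doubled)     # abc246def912ghi
```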

`re.split()`: splits the string wherever the regular expression matches and returns the resulting pieces as a list.
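
A short sketch of `re.split()`, using a made-up input string. It also shows two variations: a capturing group keeps the separators in the result, and `maxsplit` limits the number of splits.

```python
import re

string = 'a1b22c333d'

parts = re.split(r'\d+', string)                # separators dropped
with_seps = re.split(r'(\d+)', string)          # capturing group keeps them
limited = re.split(r'\d+', string, maxsplit=1)  # split at most once

print(parts)      # ['a', 'b', 'c', 'd']
print(with_seps)  # ['a', '1', 'b', '22', 'c', '333', 'd']
print(limited)    # ['a', 'b22c333d']
```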

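`re.compile()`: compiles a regular expression pattern into a pattern object that can be reused in later matching operations, which avoids recompiling the pattern each time.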
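
As a minimal sketch, a compiled pattern exposes the same operations as the module-level functions, only as methods. The names `digits` and `lines` are illustrative.

```python
import re

# Compile once, then reuse the pattern object for several operations.
digits = re.compile(r'\d+')
lines = ['abc123', 'no digits here', '45 and 6']

first = digits.search(lines[0]).group()
all_in_last = digits.findall(lines[2])
masked = [digits.sub('#', s) for s in lines]

print(first)        # 123
print(all_in_last)  # ['45', '6']
print(masked)       # ['abc#', 'no digits here', '# and #']
```

The `re` module caches recently compiled patterns internally, so for a script that uses only a handful of patterns the speed gain from `re.compile()` is likely to be small. The clearer benefit is usually readability: the pattern gets a name and is defined in one place.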

import re

# Sample input: several <span> elements, each wrapped in single quotes.
string = """
         '<span style="font">Le Petit tkm</span>'
         '<span style="tkm">Le Petit jk</span>'
         '<span style="fo">Le Petit ert</span>'
         '<span style="font-size:12px;">Le Petit ddf</span>'
         '<span style="-size:;">Le Petit ince</span>'
         """

# How the lazy quantifiers in the patterns below behave:
# - Each .*? matches as few characters as possible and only grows when the
#   rest of the pattern would otherwise fail.
# - In tem, style=".*?" stops at the first closing quote. The .*? right before
#   the named group can match the empty string, so (?P<name>.*?) captures
#   everything between "Le Petit " and "</span>".
# - In tem2, the same reasoning makes the named group capture the whole value
#   of the style attribute (the group was called "age" originally; it is named
#   "style" here to reflect what it captures).
# re.S lets . match newline characters as well.
tem = re.compile(r"""'<span style=".*?">Le Petit .*?(?P<name>.*?)</span>""", re.S)
tem2 = re.compile(r"""'<span style=".*?(?P<style>.*?)">Le Petit .*?</span>""", re.S)

# Prints tkm, jk, ert, ddf, ince (one per line).
for i in tem.finditer(string):
    print(i.group("name"))
# Prints font, tkm, fo, font-size:12px;, -size:; (one per line).
for j in tem2.finditer(string):
    print(j.group("style"))

posted @ 2025-02-28 19:14 不知名de菜鸟
