alex_bn_lee

[951] Understanding the pattern of "(.*?)" in Python's re package

In Python's regular expressions, (.*?) is a capturing group with a non-greedy quantifier. 

Let's break down the components:

  1. ( and ): Parentheses create a capturing group, which lets us retrieve the portion of the text matched inside them (for example with group(1)).
  2. .*?: Inside the capturing group, the . matches any single character except a newline (unless the re.DOTALL flag is set), the * is a quantifier meaning "zero or more occurrences", and the trailing ? makes the * non-greedy (also called lazy), so it matches as few characters as possible while still allowing the overall pattern to match.

So, (.*?) captures any sequence of characters (including an empty sequence), but does so in a non-greedy way. This is useful when we want to capture the shortest possible substring that allows the overall pattern to match.
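The points above can be checked with a small sketch. The angle-bracket delimiters and the sample strings are made up for illustration and are not from the original post; the sketch only shows the lazy capture, the empty capture, and the default newline behavior of the dot.

```python
import re

# Lazy group between fixed delimiters: captures as little as possible.
m = re.search(r'<(.*?)>', '<a><b>')
print(m.group(1))  # a

# The group is allowed to capture an empty string.
m_empty = re.search(r'<(.*?)>', '<>')
print(repr(m_empty.group(1)))  # ''

# By default "." does not match a newline, so there is no match here...
multiline = '<line one\nline two>'
m_default = re.search(r'<(.*?)>', multiline)
print(m_default)  # None

# ...unless re.DOTALL is passed (illustrative sample text).
m_dotall = re.search(r'<(.*?)>', multiline, flags=re.DOTALL)
print(repr(m_dotall.group(1)))  # 'line one\nline two'
```

Note that the lazy group stops at the first closing delimiter it can reach, which is why the first search yields a rather than a><b.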

Here is a brief example to illustrate the difference between greedy and non-greedy quantifiers:

import re

text = "abc123def456ghi"

# Greedy match: (.*) grabs as much as it can, then backtracks to the last digit
greedy_match = re.search(r'(.*)\d', text)
if greedy_match:
    print("Greedy match:", greedy_match.group(1))  # Output: abc123def45

# Non-greedy match: (.*?) stops as soon as a digit can follow
non_greedy_match = re.search(r'(.*?)\d', text)
if non_greedy_match:
    print("Non-greedy match:", non_greedy_match.group(1))  # Output: abc

In the greedy match, (.*)\d captures everything up to the last digit in the string (the final 6 is consumed by \d), while in the non-greedy match, (.*?)\d captures only the text before the first digit. The non-greedy approach is often useful when you want to extract the shortest substring between two specific patterns.
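As a rough sketch of that "between two patterns" use case, here is the quoted form of the pattern from the title, "(.*?)", applied with re.findall. The sample string is invented for illustration. The negated character class at the end is a common alternative and is not something the original post discusses.

```python
import re

# Hypothetical sample text with two quoted values.
text = 'name="alice" role="admin"'

# Lazy: each match ends at the nearest closing quote.
lazy = re.findall(r'"(.*?)"', text)
print(lazy)  # ['alice', 'admin']

# Greedy: a single match runs from the first quote to the last one.
greedy = re.findall(r'"(.*)"', text)
print(greedy)  # ['alice" role="admin']

# Alternative: a negated character class also stops at the next quote.
negated = re.findall(r'"([^"]*)"', text)
print(negated)  # ['alice', 'admin']
```

With a single capturing group, re.findall returns the captured strings themselves, so the delimiting quotes are not included in the results.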

Posted by McDelfino
