LeetCode 10: Regular Expression Matching
Description:
Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.
'.' matches any single character.
'*' matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note:
s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:
Input: s = "aa" p = "a" Output: false Explanation: "a" does not match the entire string "aa".
Example 2:
Input: s = "aa" p = "a*" Output: true Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input: s = "ab" p = ".*" Output: true Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input: s = "aab" p = "c*a*b" Output: true Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".
Example 5:
Input: s = "mississippi" p = "mis*is*p*." Output: false
Approach 1: Recursion
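First, consider the case where the pattern contains no '*'. A '.' matches any single character. We check whether the first characters match and, if they do, recursively check whether the remaining characters match.
The code is as follows: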
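#include <string>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.length() == 0) return s.length() == 0;
        // The first characters match if s is non-empty and p[0] is '.' or equals s[0].
        bool first_match = (s.length() != 0 && (s[0] == p[0] || p[0] == '.'));
        // Recurse on the remainder of both strings.
        return first_match && isMatch(s.substr(1), p.substr(1));
    }
};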
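Without '*', every pattern character consumes exactly one input character, so the recursion above amounts to a position-by-position comparison of two strings of equal length. A minimal sketch of that observation as a loop follows; the name matchNoStar is hypothetical and is not part of the original solution.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): matches s against a
// pattern p that contains only lowercase letters and '.', with no '*'.
bool matchNoStar(const std::string& s, const std::string& p) {
    // Each pattern character consumes exactly one input character,
    // so the lengths must agree for a full match.
    if (s.size() != p.size()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // '.' matches anything; otherwise the characters must be equal.
        if (p[i] != '.' && p[i] != s[i]) return false;
    }
    return true;
}
```

For instance, matchNoStar("abc", "a.c") holds, while matchNoStar("aa", "a") does not because the lengths differ.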
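Next, consider the case where '*' is present. There are two situations:
1. The first character of the input does not match the first character of the pattern, e.g. s = "abc", p = "c*abc". Here "c*" matches zero characters, so we skip the first two characters of p and continue with s = "abc", p = "abc".
2. The first character of the input matches the first character of the pattern, e.g. s = "abc", p = "a*bc". Since '*' can stand for several occurrences of the preceding character, we skip the first character of s and continue with s = "bc", p = "a*bc".
Skipping the "x*" prefix can also be the right choice when the first characters do match (e.g. s = "ab", p = "a*ab"), so the code tries the skip first and falls back to consuming one input character.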
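#include <string>
using namespace std;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if (p.length() == 0) return s.length() == 0;
        bool first_match = (s.length() != 0 && (s[0] == p[0] || p[0] == '.'));
        if (p.length() >= 2 && p[1] == '*') {
            // Either "x*" matches zero characters (skip it in p),
            // or it matches s[0] and may match more (advance s, keep p).
            return isMatch(s, p.substr(2)) ||
                   (first_match && isMatch(s.substr(1), p));
        } else {
            // No '*': consume one character from each string.
            return first_match && isMatch(s.substr(1), p.substr(1));
        }
    }
};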
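As a quick sanity check, the five examples from the problem statement can be run against this recursion. The sketch below restates the recursion as a free function so that it compiles on its own; the Example struct and countExampleMismatches are hypothetical names introduced only for illustration.

```cpp
#include <string>
#include <vector>

using std::string;

// The same recursion as above, written as a free function so this sketch is self-contained.
bool isMatch(string s, string p) {
    if (p.empty()) return s.empty();
    bool first_match = !s.empty() && (s[0] == p[0] || p[0] == '.');
    if (p.length() >= 2 && p[1] == '*') {
        return isMatch(s, p.substr(2)) ||
               (first_match && isMatch(s.substr(1), p));
    }
    return first_match && isMatch(s.substr(1), p.substr(1));
}

// Hypothetical record: input, pattern, and the expected result from the problem statement.
struct Example {
    string s;
    string p;
    bool expected;
};

// Returns how many of the five examples disagree with isMatch.
int countExampleMismatches() {
    const std::vector<Example> examples = {
        {"aa", "a", false},
        {"aa", "a*", true},
        {"ab", ".*", true},
        {"aab", "c*a*b", true},
        {"mississippi", "mis*is*p*.", false},
    };
    int mismatches = 0;
    for (const Example& e : examples) {
        if (isMatch(e.s, e.p) != e.expected) ++mismatches;
    }
    return mismatches;
}
```

Calling countExampleMismatches() from a main function and checking that it returns 0 confirms that the recursion agrees with all five examples.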
Complexity analysis:
- Time complexity: Let $T$ and $P$ be the lengths of the input string s and the pattern p. In the worst case, the call isMatch(s[i:], p[2j:]), i.e. on the suffix of s starting at index $i$ and the suffix of p starting at index $2j$, is made $\binom{i+j}{i}$ times, and each call builds strings of size $O(T - i)$ and $O(P - 2j)$. Thus, the total cost is of the order $\sum_{i=0}^{T} \sum_{j=0}^{P/2} \binom{i+j}{i} \, O(T + P - i - 2j)$. With some effort outside the scope of this article, one can show this is bounded by $O\big((T+P)\,2^{T + \frac{P}{2}}\big)$.
- Space complexity: Every call to isMatch creates the strings described above, possibly creating duplicates. If memory is not freed, this also takes a total of $O\big((T+P)\,2^{T + \frac{P}{2}}\big)$ space, even though the distinct suffixes of s and p that are actually needed total only $O(T^2 + P^2)$ characters.
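Since only a limited number of distinct suffix pairs ever occur, one possible refinement is to pass indices instead of copying substrings and to cache the result for each pair (i, j). The following is a minimal sketch of such a memoized variant, not the method presented above; the names isMatchMemo and matchFrom are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: does s[i:] match p[j:]?
// memo[i][j] holds -1 (not computed yet), 0 (false) or 1 (true).
static bool matchFrom(const std::string& s, const std::string& p,
                      std::size_t i, std::size_t j,
                      std::vector<std::vector<int>>& memo) {
    if (memo[i][j] != -1) return memo[i][j] == 1;
    bool ans;
    if (j == p.size()) {
        // An exhausted pattern matches only an exhausted input.
        ans = (i == s.size());
    } else {
        bool first_match = i < s.size() && (s[i] == p[j] || p[j] == '.');
        if (j + 1 < p.size() && p[j + 1] == '*') {
            // Skip "x*" entirely, or consume one input character and keep "x*".
            ans = matchFrom(s, p, i, j + 2, memo) ||
                  (first_match && matchFrom(s, p, i + 1, j, memo));
        } else {
            ans = first_match && matchFrom(s, p, i + 1, j + 1, memo);
        }
    }
    memo[i][j] = ans ? 1 : 0;
    return ans;
}

// Hypothetical entry point with the same contract as isMatch.
bool isMatchMemo(const std::string& s, const std::string& p) {
    std::vector<std::vector<int>> memo(
        s.size() + 1, std::vector<int>(p.size() + 1, -1));
    return matchFrom(s, p, 0, 0, memo);
}
```

In this sketch each pair (i, j) is evaluated at most once and no substrings are built, so it performs O(TP) work with an O(TP) table, at the cost of the extra bookkeeping.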