LeetCode Notes: Applying Dynamic Programming to String Matching


0 References

No. Title
1 One technique for four LeetCode hard problems: applying dynamic programming to string matching
2 10.Regular Expression Matching

1. [10. Regular Expression Matching]

1.1 Problem

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

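1.2 Approach && Solution Method

The dp array

First build a two-dimensional array whose columns correspond to the string S and whose rows correspond to the pattern P. dp[i][j] records whether P[0:i] (the first i characters of P) matches S[0:j] (the first j characters of S). So if P matches all of S, then dp[len(P)][len(S)] == True. Note that dp[0][0] is the case where both S and P are empty strings, which always matches, so dp[0][0] = True.

The remaining work is to fill in this matrix step by step. To do that, first initialize dp[0][j] and dp[i][0], i.e. row 0 and column 0.

  1. For row 0, P is empty, so nothing matches except an empty S. Hence dp[0][j] = False for j > 0.

  2. For column 0, we need to decide whether P can match an empty S. When S is empty, only an empty P or a P containing "*" can match, so only these two cases need handling.

    1. If P is empty, it matches S. Hence dp[0][0] = True.
    2. If P contains "*", for example "a*b*", the "*" stands for zero or more of the preceding character, so it can cancel that character and reduce this part of the pattern to "". Hence dp[i][0] = dp[i-2][0] and P[i-1] == "*". Why test dp[i-2][0] rather than dp[i-1][0]? dp[i-1][0] says that the first i-1 characters of P, which still include the character right before the "*", match the empty S. Since the "*" cancels that character, what has to match the empty S is P without both of them, i.e. the first i-2 characters, which is dp[i-2][0]. In the example in the figure below, P = "a*b*" matches the empty string "". At i = 2, j = 0, dp[0][0] == True is what gives dp[2][0] == True.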

[Figure: P = "a*b*" matched against the empty string]

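That completes the initialization of the dp matrix. The values at this point are shown in the figure: the green cells have been initialized, and the blank cells are still to be filled.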

[Figure: the dp matrix after initialization, initialized cells in green]
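The column-0 initialization described above can be sketched on its own. This is only an illustration in Python 3; the helper name `empty_s_column` is hypothetical and is not part of the solution in 1.3, and the pattern is assumed to be well formed (no leading "*").

```python
def empty_s_column(p):
    """Return column 0 of the dp matrix: col[i] is True when p[:i] matches ""."""
    col = [False] * (len(p) + 1)
    col[0] = True  # an empty pattern matches an empty string
    for i in range(2, len(p) + 1):
        # A "*" at p[i-1] cancels p[i-2], so look two rows up.
        col[i] = p[i - 1] == "*" and col[i - 2]
    return col


print(empty_s_column("a*b*"))   # [True, False, True, False, True]
print(empty_s_column("a*bc*"))  # [True, False, True, False, False, False]
```

In the second call the lone "b" can never be cancelled, so every longer prefix fails to match the empty string.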

Next, fill in the rest of the dp matrix. For dp[i][j] (i ≥ 1, j ≥ 1) there are the following cases:

  1. P[i-1] == "*":

    This case splits into two sub-cases:

    1. The "*" cancels the preceding character, i.e. the "*" matches zero occurrences:

      This is handled the same way as described earlier: dp[i][j] = dp[i-2][j]

    2. The "*" matches the preceding character N times:

      Here we need ((P[i-2] == ".") or (S[j-1] == P[i-2])) and dp[i][j-1] == True. Why? If the "*" absorbs S[j-1] as one more repetition of P[i-2], then the same pattern P[0:i], still ending in that "*", must already match the shorter string S[0:j-1], and that is exactly dp[i][j-1].

      For example, "a.*" matches "abb".

  2. P[i-1] == "." or P[i-1] is an ordinary character:

    This case is much simpler: (S[j-1] == P[i-1] or P[i-1] == ".") and dp[i-1][j-1] == True.
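The same cases can also be written top-down with memoization, which is a different presentation from the bottom-up table used in 1.3 but follows the same recurrence. The sketch below is Python 3; `is_match` and the inner `match(i, j)` are illustrative names, with `match(i, j)` playing the role of dp[i][j]. It assumes a well-formed pattern (every "*" has a preceding character) and short inputs, since it recurses.

```python
def is_match(s, p):
    """Top-down form of the recurrence; match(i, j) corresponds to dp[i][j]."""
    memo = {}

    def match(i, j):
        # Does p[:i] match s[:j]?
        if (i, j) in memo:
            return memo[(i, j)]
        if i == 0:
            ans = j == 0  # an empty pattern matches only an empty string
        elif p[i - 1] == "*":
            # Assumes a well-formed pattern, so i >= 2 here.
            # Case 1: "*" cancels the preceding character.
            ans = match(i - 2, j)
            # Case 2: "*" absorbs s[j-1] as one more repetition of p[i-2].
            if not ans and j > 0 and p[i - 2] in (s[j - 1], "."):
                ans = match(i, j - 1)
        else:
            # "." or an ordinary character consumes exactly one character of s.
            ans = (j > 0 and p[i - 1] in (s[j - 1], ".")
                   and match(i - 1, j - 1))
        memo[(i, j)] = ans
        return ans

    return match(len(p), len(s))


print(is_match("aab", "c*a*b"))               # True
print(is_match("mississippi", "mis*is*p*."))  # False
```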


1.3 Implementation

class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
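        # dp[i][j]: whether the first i characters of p match the first j characters of s
        row = len(p) + 1
        col = len(s) + 1
        dp = [ [False for i in range( col ) ] for j in range( row ) ]
        dp[0][0] = True # dp[0][0]: both p and s are empty strings
        # Column 0: s is empty, and whether p matches depends on p.
        # This prepares for len(s) = 1, 2, 3, ..., n.
        # When s is empty, only patterns of the form a*b* can match.
        # Row 0 stands for the empty pattern, so p[i - 1] is the character that row i refers to.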
        for i in range( 1, row):
            dp[i][0] = ( i > 1 ) and p[ i - 1 ] == "*" and dp[ i - 2 ][0]

        for i in range( 1, row ) :
            for j in range( 1, col ):
                
                if p[ i - 1 ] =="*":
                    dp[i][j] = dp[ i - 2 ][j] or ( p[ i - 2  ] == s[ j - 1 ] or p[ i - 2 ] == ".") and  dp[i][j-1]

                else:
                    dp[i][j] = ( p[ i - 1 ] == "."  or p[ i - 1 ]  == s[ j - 1 ]) and dp[i-1][j-1]

        return dp[row-1][col-1]
        

2. [44. Wildcard Matching]

2.1 Problem

Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like ? or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "*"
Output: true
Explanation: '*' matches any sequence.

Example 3:

Input:
s = "cb"
p = "?a"
Output: false
Explanation: '?' matches 'c', but the second letter is 'a', which does not match 'b'.

Example 4:

Input:
s = "adceb"
p = "*a*b"
Output: true
Explanation: The first '*' matches the empty sequence, while the second '*' matches the substring "dce".

Example 5:

Input:
s = "acdcb"
p = "a*c?b"
Output: false

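2.2 Approach && Solution Method

This problem follows the same idea as the previous one: maintain a two-dimensional array dp. The differences are that the symbol matching any single character is now "?", and "*" now matches any sequence of characters, including the empty one. As before, take S = "abc" and P = "a?b*"; the dp array is then: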

[Figure: dp array for S = "abc", P = "a?b*"]

Again, start by initializing row 0 and column 0.

  1. Row 0 is easy: every cell except position (0, 0) is False.
  2. For column 0, S = "", so a match is only possible when the current character is "*". The condition is then dp[i-1][0] == True.

Filling in dp again involves two cases:

  1. When P[i-1] == "*":

    This case again has two sub-cases:

    1. The "*" is used as the empty string: as before, dp[i][j] = (dp[i-1][j] == True)
    2. The "*" is used as a non-empty sequence: then dp[i][j] = (dp[i][j-1] == True). Here is how I understand it: dp[i][j-1] == True means that P[0:i], whose last character is the current "*", already matches S[0:j-1]; the "*" then simply extends its match by one more character, S[j-1].
  2. When P[i-1] == "?" or P[i-1] is an ordinary character:

    Then (S[j-1] == P[i-1] or P[i-1] == "?") and dp[i-1][j-1] == True
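As a small illustration of the column-0 rule, the sketch below (Python 3) computes which prefixes of P match the empty string. The helper name `empty_s_column_wildcard` is hypothetical and not part of the solution in 2.3.

```python
def empty_s_column_wildcard(p):
    """col[i] is True when p[:i] matches "" under wildcard rules."""
    col = [False] * (len(p) + 1)
    col[0] = True  # an empty pattern matches an empty string
    for i in range(1, len(p) + 1):
        # Only an unbroken run of leading "*" can match the empty string.
        col[i] = p[i - 1] == "*" and col[i - 1]
    return col


print(empty_s_column_wildcard("**a*"))  # [True, True, True, False, False]
```

Once a character other than "*" appears, every longer prefix stays False, because each row only looks one row up.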
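To see how the two "*" sub-cases spread True values across a row, the sketch below (Python 3) builds the full table for Example 4 and prints it row by row. `wildcard_table` is an illustrative helper, not the author's code; it applies the rules listed above.

```python
def wildcard_table(s, p):
    """Build the dp table: dp[i][j] is True when p[:i] matches s[:j]."""
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True
    for i in range(1, rows):
        dp[i][0] = p[i - 1] == "*" and dp[i - 1][0]
    for i in range(1, rows):
        for j in range(1, cols):
            if p[i - 1] == "*":
                # "*" as empty (look up) or "*" taking s[j-1] too (look left).
                dp[i][j] = dp[i - 1][j] or dp[i][j - 1]
            else:
                dp[i][j] = p[i - 1] in (s[j - 1], "?") and dp[i - 1][j - 1]
    return dp


# One line per pattern prefix; the first row is the empty pattern.
for label, row in zip(" " + "*a*b", wildcard_table("adceb", "*a*b")):
    print(label, "".join("T" if cell else "." for cell in row))
```

Each "*" row turns True from the first column where the row above is True and stays True to the right, which is the "look left" rule at work.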


2.3 Implementation

#!/bin/python

class Solution(object):
    def isMatch(self, s, p):
        """
        :type s: str
        :type p: str
        :rtype: bool
        """
        row = len(p) + 1
        col = len(s) + 1
        dp = [ [False for i in range( 0, col )] for j in range( 0, row ) ]

        dp[0][0] = True
        for j in range(1, col ):
            dp[0][j] = False
        for i in range( 1, row):
            if p[i-1] == "*":
                dp[i][0] = dp[i-1][0]

        for i in range( 1, row ):
            for j in range( 1, col ):
                if p[i-1] == "*":
                    dp[i][j] = dp[i-1][j] or dp[i][j-1]
                else:
                    dp[i][j] = (s[j-1] == p[i-1] or p[i-1] == "?") and dp[i-1][j-1]
        return dp[row-1][col-1]

if __name__ == "__main__":
    m = Solution()
    # print() with a single string argument works on both Python 2 and Python 3
    print("s:[aa],p[a] ret:"+str(m.isMatch("aa","a")))
    print("s:[aa],p[*] ret:"+str(m.isMatch("aa","*")))
    print("s:[cb],p[?a] ret:"+str(m.isMatch("cb","?a")))
    print("s:[],p[] ret:"+str(m.isMatch("","")))
    print("s:[acdcb],p[a*c?b] ret:"+str(m.isMatch("acdcb","a*c?b")))
    print("s:[adceb],p[*a*b] ret:"+str(m.isMatch("adceb","*a*b")))

3. [97. Interleaving String]

3.1 Problem

Given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2.

Example 1:

Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbcbcac"
Output: true

Example 2:

Input: s1 = "aabcc", s2 = "dbbca", s3 = "aadbbbaccc"
Output: false

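3.2 Approach && Solution Method

The idea is the same again: maintain a dp array. dp[i][j] indicates whether the first i characters of S1 and the first j characters of S2 can be interleaved to form the first i + j characters of S3, the last of which is S3[i+j-1]. Suppose S1 = "aa", S2 = "ab", S3 = "aaba". Starting from dp[0][0], a step to the right means using the next character of S2, S2[j-1], to produce S3[0+j-1] (the 0 is i, because in the first row i is 0). Likewise, a step down means using the next character of S1, S1[i-1], to produce S3[i-1+0]. So dp[i][j] can be reached only if dp[i-1][j] or dp[i][j-1] is True and the character consumed by that step equals S3[i+j-1].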

[Figure: dp array for S1 = "aa", S2 = "ab", S3 = "aaba"]
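The walk through the grid can be sketched for the small example above. The following is a minimal Python 3 illustration; `interleave_table` is a hypothetical helper that returns the whole table so the reachable cells can be inspected, and it returns None when the lengths cannot add up.

```python
def interleave_table(s1, s2, s3):
    """dp[i][j]: can s1[:i] and s2[:j] interleave into s3[:i + j]?"""
    if len(s1) + len(s2) != len(s3):
        return None  # the lengths cannot work out
    rows, cols = len(s1) + 1, len(s2) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True
    for i in range(rows):
        for j in range(cols):
            if i > 0 and dp[i - 1][j] and s1[i - 1] == s3[i + j - 1]:
                dp[i][j] = True  # step down: consume s1[i-1]
            if j > 0 and dp[i][j - 1] and s2[j - 1] == s3[i + j - 1]:
                dp[i][j] = True  # step right: consume s2[j-1]
    return dp


for row in interleave_table("aa", "ab", "aaba"):
    print(["T" if cell else "." for cell in row])
```

For this example the bottom-right cell is reachable, for instance by the path down, right, right, down, which reads "a" from S1, "ab" from S2, then "a" from S1.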

3.3 Implementation

#!/bin/python

class Solution(object):
    def isInterleave(self, s1, s2, s3):
        """
        :type s1: str
        :type s2: str
        :type s3: str
        :rtype: bool
        """
        row = len(s1) + 1
        col = len(s2) + 1
        t = len(s3)
        if row + col -2 !=t :
            return False
        dp = [ [False for j in range(col)] for i in range(row) ]
        dp[0][0] = True
        for j in range(1,col):
            dp[0][j] = dp[0][j-1] and s2[j-1] == s3[j-1]
        for i in range(1,row):
            dp[i][0] = dp[i-1][0] and s1[i-1] == s3[i-1]

        for i in range(1,row):
            for j in range(1,col):
       
                dp[i][j] = ( dp[i-1][j] and s1[i-1] == s3[i+j-1]) or (dp[i][j-1] and s2[j-1] == s3[i+j-1])
        return dp[row-1][col-1]

if __name__=='__main__':
    m = Solution()
    print(m.isInterleave("a","b","a"))
posted @ 2019-05-16 23:08  bush2582