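Subsequence and Substring Problems in Dynamic Programming

Problems that ask for a subsequence or a substring, as well as edit-distance problems, can be solved with dynamic programming. This post walks through them in detail.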

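Classifying subsequence and substring problems

These problems fall into three main categories:

  1. Contiguous subarrays or substrings: maximum subarray sum, longest palindromic substring
  2. Non-contiguous subsequences: longest increasing subsequence, longest palindromic subsequence
  3. Problems involving two strings/arrays: longest common subsequence, minimum edit distance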
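
The distinction behind the first two categories (a substring is contiguous, while a subsequence only has to preserve order) can be checked directly. The following is a minimal sketch; the helper names `is_substring` and `is_subsequence` are hypothetical and do not appear in the rest of this post.

```python
def is_substring(t, s):
    """True if t occurs in s as one contiguous block."""
    return t in s


def is_subsequence(t, s):
    """True if t can be obtained from s by deleting characters while keeping order."""
    it = iter(s)
    # `ch in it` consumes the iterator up to the first match, so order is enforced
    return all(ch in it for ch in t)


print(is_substring("ace", "abcde"))    # False
print(is_subsequence("ace", "abcde"))  # True
```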

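These problems can generally be solved with either a one-dimensional or a two-dimensional dp array.

One-dimensional dp array

In the state transition, dp[i] depends either on dp[i-1] alone or on all of dp[0] through dp[i-1].

Maximum subarray sum
dp[i] is the largest sum of a contiguous subarray ending at nums[i]. The state transition equation is: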

$$dp[i] = \max(dp[i-1] + nums[i],\ nums[i])$$

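For this problem the answer is not dp[n-1] but the maximum over the whole dp array, because the best subarray may end at any index.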

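def maxSubArray(nums):
    """
    Maximum subarray sum.
    dp[i] is the largest sum of a subarray ending at nums[i].
    """
    n = len(nums)
    if n == 0:
        return 0  # empty input: treat the answer as 0
    dp = [0] * n
    dp[0] = nums[0]
    for i in range(1, n):
        # either extend the best subarray ending at i-1, or start fresh at i
        dp[i] = max(dp[i-1] + nums[i], nums[i])
    return max(dp)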
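
Because dp[i] depends only on dp[i-1], the full array is not strictly needed. The sketch below keeps just the previous value and a running maximum; the function name `max_sub_array_rolling` is hypothetical, and returning 0 for empty input is an assumption carried over from the function above.

```python
def max_sub_array_rolling(nums):
    """Maximum subarray sum with O(1) extra space."""
    if not nums:
        return 0  # assumption: empty input yields 0
    cur = best = nums[0]  # cur plays the role of dp[i-1]
    for x in nums[1:]:
        cur = max(cur + x, x)   # same transition as dp[i]
        best = max(best, cur)   # running maximum replaces max(dp)
    return best


print(max_sub_array_rolling([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```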

Longest increasing subsequence
dp[i] is the length of the longest increasing subsequence that ends at nums[i]. The state transition equation is:
$$dp[i] = \begin{cases} \max\limits_{0 \le j < i,\ nums[j] < nums[i]} dp[j] + 1, & \text{if some such } j \text{ exists} \\ 1, & \text{otherwise} \end{cases}$$
As with the previous problem, the final answer is the maximum over the dp array.

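def lengthOfLIS(nums):
    """
    Longest increasing subsequence.
    dp[i] is the length of the longest increasing subsequence ending at nums[i].
    """
    if not nums:
        return 0  # max() on an empty dp array would raise ValueError
    dp = [1] * len(nums)
    for i in range(1, len(nums)):
        for j in range(0, i):
            if nums[i] > nums[j]:
                # nums[i] can extend the subsequence ending at nums[j]
                dp[i] = max(dp[i], dp[j]+1)
    return max(dp)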
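
The same table can also recover one actual subsequence, not only its length, by remembering which j produced each dp[i]. This is a sketch under that assumption; `lis_with_path` and the `prev` array are illustrative names that are not part of the original code.

```python
def lis_with_path(nums):
    """Return one longest strictly increasing subsequence of nums."""
    if not nums:
        return []
    n = len(nums)
    dp = [1] * n
    prev = [-1] * n  # prev[i] is the index that precedes i in the best chain
    for i in range(1, n):
        for j in range(i):
            if nums[i] > nums[j] and dp[j] + 1 > dp[i]:
                dp[i] = dp[j] + 1
                prev[i] = j
    # start from the (first) index where the longest chain ends
    k = max(range(n), key=dp.__getitem__)
    path = []
    while k != -1:
        path.append(nums[k])
        k = prev[k]
    return path[::-1]


print(lis_with_path([10, 9, 2, 5, 3, 7, 101, 18]))  # [2, 5, 7, 101]
```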

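Two-dimensional dp array

Problems involving two strings or arrays generally use a two-dimensional dp array. In the state transition, dp[i][j] may depend on dp[i-1][j-1], dp[i-1][j], and dp[i][j-1].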

Longest common subsequence
dp[i][j] is the length of the longest common subsequence of s1[0:i] and s2[0:j] (half-open intervals), with dp[i][0] = dp[0][j] = 0. For i, j >= 1 the state transition equation is:
$$dp[i][j] = \begin{cases} dp[i-1][j-1] + 1, & s1[i-1] = s2[j-1] \\ \max(dp[i-1][j],\ dp[i][j-1]), & s1[i-1] \ne s2[j-1] \end{cases}$$

def longestCommonSubsequence(str1, str2) -> int:
    m, n = len(str1), len(str2)
    dp = [[0]*(n+1) for i in range(m+1)]
    for i in range(1, m+1):
        for j in range(1, n+1):
            if str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]
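
Once the table is filled, one common subsequence itself can be read off by walking back from dp[m][n]. The sketch below rebuilds the table and then backtracks; `lcs_string` is a hypothetical name, and when several longest common subsequences exist it returns just one of them.

```python
def lcs_string(s1, s2):
    """Return one longest common subsequence of s1 and s2 as a string."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # backtrack: follow the choices that produced dp[m][n]
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])  # this character is part of the LCS
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs_string("abcde", "ace"))  # ace
```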

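Minimum edit distance
dp[i][j] is the minimum edit distance between s1[0:i] and s2[0:j] (half-open intervals), with dp[i][0] = i and dp[0][j] = j. The state transition equation is: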
$$dp[i][j] = \begin{cases} dp[i-1][j-1], & s1[i-1] = s2[j-1] \\ \min(dp[i-1][j],\ dp[i][j-1],\ dp[i-1][j-1]) + 1, & s1[i-1] \ne s2[j-1] \end{cases}$$

def min_distance(s1, s2):
    n1, n2 = len(s1), len(s2)
    dp = [[0]*(n2+1) for i in range(n1+1)]
    for i in range(1, n1+1):
        dp[i][0] = i
    for j in range(1, n2+1):
        dp[0][j] = j
    for i in range(1, n1+1):
        for j in range(1, n2+1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1
    return dp[n1][n2]
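
Each row of the table depends only on the row above it and on the cells to its left, so two rows are enough if only the distance is needed. This is a space-saving sketch of the same recurrence, not a different algorithm; the name `min_distance_two_rows` is hypothetical.

```python
def min_distance_two_rows(s1, s2):
    """Minimum edit distance using two rows instead of the full table."""
    n1, n2 = len(s1), len(s2)
    prev = list(range(n2 + 1))  # row 0: dp[0][j] = j
    for i in range(1, n1 + 1):
        cur = [i] + [0] * n2    # dp[i][0] = i
        for j in range(1, n2 + 1):
            if s1[i - 1] == s2[j - 1]:
                cur[j] = prev[j - 1]
            else:
                # prev[j]: delete, cur[j-1]: insert, prev[j-1]: replace
                cur[j] = min(prev[j], cur[j - 1], prev[j - 1]) + 1
        prev = cur
    return prev[n2]


print(min_distance_two_rows("horse", "ros"))            # 3
print(min_distance_two_rows("intention", "execution"))  # 5
```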

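Palindrome problems on a single string also use a two-dimensional dp array. In the state transition, dp[i][j] may depend on dp[i+1][j-1], dp[i+1][j], and dp[i][j-1].

Longest palindromic subsequence
dp[i][j] is the length of the longest palindromic subsequence of s between positions i and j (both inclusive), with dp[i][i] = 1. For i < j the state transition equation is: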
$$dp[i][j] = \begin{cases} dp[i+1][j-1] + 2, & s[i] = s[j] \\ \max(dp[i+1][j],\ dp[i][j-1]), & s[i] \ne s[j] \end{cases}$$
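Because dp[i][j] depends on dp[i+1][j-1], dp[i+1][j], and dp[i][j-1], row i+1 must be filled before row i. The loop over i therefore runs backwards from n-1 down to 0, while j runs forwards from i+1, so all three values are ready when dp[i][j] is computed.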

def longestPalindromeSubseq(s):
    n = len(s)
    if n == 0:
        return 0  # dp[0][n-1] below would raise IndexError on an empty string
    dp = [[0]*n for i in range(n)]
    for i in range(n):
        dp[i][i] = 1
    for i in range(n-1, -1, -1):
        for j in range(i+1, n):
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i+1][j], dp[i][j-1])
    return dp[0][n-1]
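
As a cross-check that connects this problem to the two-string category, the length of the longest palindromic subsequence of s equals the length of the longest common subsequence of s and its reverse. The sketch below uses that well-known identity in place of the interval dp above; `lps_via_lcs` is a hypothetical name.

```python
def lps_via_lcs(s):
    """Length of the longest palindromic subsequence, computed as LCS(s, reversed s)."""
    t = s[::-1]
    n = len(s)
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][n]


print(lps_via_lcs("bbbab"))  # 4
print(lps_via_lcs("cbbd"))   # 2
```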

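Longest palindromic substring
dp[i][j] records whether s[i..j] (both ends inclusive) is a palindrome, with dp[i][i] = True. The state transition equation is: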
$$dp[i][j] = \begin{cases} True, & s[i] = s[j],\ j - i \le 2 \\ dp[i+1][j-1], & s[i] = s[j],\ j - i > 2 \\ False, & s[i] \ne s[j] \end{cases}$$
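Because dp[i][j] depends on dp[i+1][j-1], column j-1 must be complete before column j is filled. The code below therefore iterates j in increasing order in the outer loop, so dp[i+1][j-1] is always available when dp[i][j] is computed.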

def longestPalindrome(s: str) -> str:
    n = len(s)
    dp = [[False]*n for i in range(n)]
    for i in range(n):
        dp[i][i] = True
    res, start = 1, 0
    for j in range(1, n):
        for i in range(j-1, -1, -1):
            if s[i] == s[j]:
                if j - i <= 2:
                    dp[i][j] = True
                else:
                    dp[i][j] = dp[i+1][j-1]
            if dp[i][j] and j - i + 1 > res:
                res = j - i + 1
                start = i
    return s[start: start+res]
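
For comparison, the same answer can be found without the O(n^2) table by expanding around each possible center. This is a different technique from the dp above, shown only as a sketch; `longest_palindrome_expand` and its inner helper `expand` are hypothetical names, and when several longest palindromes exist it returns the first one it finds.

```python
def longest_palindrome_expand(s):
    """Longest palindromic substring via expand-around-center, O(1) extra space."""
    if not s:
        return ""
    start, end = 0, 0  # inclusive bounds of the best palindrome so far

    def expand(lo, hi):
        # grow outward while the two ends match, then step back one position
        while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
            lo -= 1
            hi += 1
        return lo + 1, hi - 1

    for c in range(len(s)):
        # odd-length center (c, c) and even-length center (c, c+1)
        for lo, hi in (expand(c, c), expand(c, c + 1)):
            if hi - lo > end - start:
                start, end = lo, hi
    return s[start:end + 1]


print(longest_palindrome_expand("babad"))  # bab
print(longest_palindrome_expand("cbbd"))   # bb
```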

posted @ 2020-05-30 21:42  黄然小悟