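Subsequence and Substring Problems in Dynamic Programming
Problems that ask for a subsequence or substring, as well as edit-distance problems, can be solved with dynamic programming. This post walks through the common cases.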
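Classifying subsequence and substring problems
These problems fall into three main categories:
- Contiguous subarrays or substrings: maximum subarray sum, longest palindromic substring
- Non-contiguous subsequences: longest increasing subsequence, longest palindromic subsequence
- Two strings or arrays: longest common subsequence, minimum edit distance
Most of them can be solved with either a one-dimensional or a two-dimensional dp array.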
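One-dimensional dp array
In the state equation, dp[i] depends either on dp[i-1] alone or on all of dp[0] through dp[i-1].
Maximum subarray sum
dp[i] is the largest sum of a contiguous subarray that ends at nums[i]. The state transition equation is: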
$$dp[i] = \max(dp[i-1] + nums[i],\ nums[i])$$
The best subarray can end at any index, so the final answer is the maximum value in the dp array rather than its last entry.
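def maxSubArray(nums):
    """
    Maximum subarray sum.
    dp[i] is the largest sum of a contiguous subarray ending at nums[i].
    """
    n = len(nums)
    if n == 0:
        return 0
    dp = [0] * n
    dp[0] = nums[0]
    for i in range(1, n):
        # Either extend the best subarray ending at i-1, or start fresh at i.
        dp[i] = max(dp[i-1] + nums[i], nums[i])
    res = max(dp)
    return res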
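Since dp[i] here depends only on dp[i-1], the full array is not strictly needed. The following is a minimal sketch of the same recurrence with two running variables; the function name `max_subarray_rolling` and its variable names are illustrative, not part of the original code.

```python
def max_subarray_rolling(nums):
    """Maximum subarray sum using O(1) extra space (illustrative sketch)."""
    if not nums:
        return 0  # assumption: empty input returns 0, like maxSubArray
    best_ending_here = nums[0]  # plays the role of dp[i-1]
    best_overall = nums[0]      # plays the role of max(dp)
    for x in nums[1:]:
        # Same transition: extend the previous subarray or restart at x.
        best_ending_here = max(best_ending_here + x, x)
        best_overall = max(best_overall, best_ending_here)
    return best_overall


print(max_subarray_rolling([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```

The result 6 comes from the subarray [4, -1, 2, 1].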
Longest increasing subsequence
dp[i] is the length of the longest increasing subsequence that ends at nums[i]. The state transition equation is:
$$dp[i] = \begin{cases} \max\limits_{0 \le j < i,\ nums[j] < nums[i]} dp[j] + 1, & \text{if some } j < i \text{ has } nums[j] < nums[i] \\ 1, & \text{otherwise} \end{cases}$$
As in the previous problem, the final answer is the maximum value in the dp array.
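def lengthOfLIS(nums):
    """
    Longest increasing subsequence.
    dp[i] is the length of the longest increasing subsequence ending at nums[i].
    """
    if not nums:
        return 0  # max() would raise on an empty dp array
    dp = [1] * len(nums)  # every element alone is a subsequence of length 1
    for i in range(1, len(nums)):
        for j in range(0, i):
            if nums[i] > nums[j]:
                # nums[i] can extend the subsequence ending at nums[j].
                dp[i] = max(dp[i], dp[j]+1)
    res = max(dp)
    return res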
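To make the recurrence concrete, here is a small sketch that returns the whole dp array instead of only its maximum. The helper name `lis_table` and the sample input are chosen for illustration.

```python
def lis_table(nums):
    """Return the dp array of the O(n^2) LIS recurrence (illustrative sketch)."""
    dp = [1] * len(nums)
    for i in range(1, len(nums)):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] = max(dp[i], dp[j] + 1)
    return dp


nums = [10, 9, 2, 5, 3, 7, 101, 18]
dp = lis_table(nums)
print(dp)       # [1, 1, 1, 2, 2, 3, 4, 4]
print(max(dp))  # 4
```

The entry for 7 is 3 because it can extend either 2, 5 or 2, 3; both 101 and 18 can then extend that to length 4.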
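Two-dimensional dp array
Problems that involve two strings or arrays generally use a two-dimensional dp array. In the state equation, dp[i][j] may depend on dp[i-1][j-1], dp[i-1][j], dp[i][j-1], and so on.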
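Longest common subsequence
dp[i][j] is the length of the longest common subsequence of s1[0:i] and s2[0:j] (half-open intervals, so the two prefixes have lengths i and j). The state transition equation is: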
$$dp[i][j] = \begin{cases} dp[i-1][j-1] + 1, & s1[i-1] = s2[j-1] \\ \max(dp[i-1][j],\ dp[i][j-1]), & s1[i-1] \ne s2[j-1] \end{cases}$$
def longestCommonSubsequence(str1, str2) -> int:
    m, n = len(str1), len(str2)
    # Row 0 and column 0 stand for the empty prefix, so they stay 0.
    dp = [[0]*(n+1) for i in range(m+1)]
    for i in range(1, m+1):
        for j in range(1, n+1):
            if str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]
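The table holds more than the length: walking back from dp[m][n] recovers one actual common subsequence. The sketch below is one way to do that under the same table definition; the function name `lcs_string` and the tie-breaking rule (prefer moving up when both neighbours are equal) are assumptions made for this example, and other tie-breaks can return a different subsequence of the same length.

```python
def lcs_string(s1, s2):
    """Return one longest common subsequence of s1 and s2 (illustrative sketch)."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Backtrack from the bottom-right corner to collect matched characters.
    i, j = m, n
    chars = []
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            chars.append(s1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1  # the value came from the cell above
        else:
            j -= 1  # the value came from the cell to the left
    return "".join(reversed(chars))


print(lcs_string("abcde", "ace"))  # ace
```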
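Minimum edit distance
dp[i][j] is the minimum edit distance between s1[0:i] and s2[0:j] (half-open intervals), with base cases dp[i][0] = i and dp[0][j] = j for converting to or from the empty string. The state transition equation is: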
$$dp[i][j] = \begin{cases} dp[i-1][j-1], & s1[i-1] = s2[j-1] \\ \min(dp[i-1][j],\ dp[i][j-1],\ dp[i-1][j-1]) + 1, & s1[i-1] \ne s2[j-1] \end{cases}$$
def min_distance(s1, s2):
    n1, n2 = len(s1), len(s2)
    dp = [[0]*(n2+1) for i in range(n1+1)]
    # Base cases: turning a prefix into the empty string, and vice versa.
    for i in range(1, n1+1):
        dp[i][0] = i
    for j in range(1, n2+1):
        dp[0][j] = j
    for i in range(1, n1+1):
        for j in range(1, n2+1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1
    return dp[n1][n2]
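The three terms inside the min are not labelled above. Under the usual reading of edit distance (transforming s1 into s2 with single-character deletions, insertions, and replacements), they correspond to the operations named in the sketch below. The function name `edit_distance` and the local variable names are illustrative only.

```python
def edit_distance(s1, s2):
    """Minimum edit distance with each transition term named (illustrative sketch)."""
    n1, n2 = len(s1), len(s2)
    dp = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(n1 + 1):
        dp[i][0] = i  # delete all i characters of s1[0:i]
    for j in range(n2 + 1):
        dp[0][j] = j  # insert all j characters of s2[0:j]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # characters match, no cost
            else:
                delete = dp[i - 1][j]       # drop s1[i-1]
                insert = dp[i][j - 1]       # append s2[j-1]
                replace = dp[i - 1][j - 1]  # turn s1[i-1] into s2[j-1]
                dp[i][j] = 1 + min(delete, insert, replace)
    return dp[n1][n2]


print(edit_distance("horse", "ros"))  # 3
```

One sequence of three edits for this input is horse -> rorse -> rose -> ros.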
Palindrome problems on a single string also use a two-dimensional dp array. In the state equation, dp[i][j] may depend on dp[i+1][j-1], dp[i+1][j], dp[i][j-1], and so on.
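Longest palindromic subsequence
dp[i][j] is the length of the longest palindromic subsequence of s between positions i and j, both inclusive. The state transition equation is: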
$$dp[i][j] = \begin{cases} dp[i+1][j-1] + 2, & s[i] = s[j] \\ \max(dp[i+1][j],\ dp[i][j-1]), & s[i] \ne s[j] \end{cases}$$
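Because dp[i][j] depends on dp[i+1][j-1], dp[i+1][j], and dp[i][j-1], the row index i has to be traversed in reverse (from n-1 down to 0) while j increases, so that those three entries are already computed when dp[i][j] is filled in.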
def longestPalindromeSubseq(s):
    n = len(s)
    if n == 0:
        return 0  # dp[0][n-1] does not exist for an empty string
    dp = [[0]*n for i in range(n)]
    # A single character is a palindrome of length 1.
    for i in range(n):
        dp[i][i] = 1
    for i in range(n-1, -1, -1):
        for j in range(i+1, n):
            if s[i] == s[j]:
                dp[i][j] = dp[i+1][j-1] + 2
            else:
                dp[i][j] = max(dp[i+1][j], dp[i][j-1])
    return dp[0][n-1]
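A handy sanity check ties this problem back to the longest common subsequence: the length of the longest palindromic subsequence of s equals the LCS length of s and its reverse. The sketch below compares the two on a few sample strings; the function names `lps_length` and `lcs_length` and the sample inputs are chosen for this example.

```python
def lps_length(s):
    """Longest palindromic subsequence length via the interval dp (sketch)."""
    n = len(s)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        dp[i][i] = 1
        for j in range(i + 1, n):
            if s[i] == s[j]:
                dp[i][j] = dp[i + 1][j - 1] + 2
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
    return dp[0][n - 1]


def lcs_length(a, b):
    """Longest common subsequence length via the prefix dp (sketch)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


# The two computations should agree on every input.
for text in ["bbbab", "cbbd", "a", "character"]:
    assert lps_length(text) == lcs_length(text, text[::-1])

print(lps_length("bbbab"))  # 4
```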
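Longest palindromic substring
dp[i][j] indicates whether the substring of s from index i to index j, both inclusive, is a palindrome. The state transition equation is: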
$$dp[i][j] = \begin{cases} \text{True}, & s[i] = s[j],\ j - i \le 2 \\ dp[i+1][j-1], & s[i] = s[j],\ j - i > 2 \\ \text{False}, & s[i] \ne s[j] \end{cases}$$
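Because dp[i][j] depends on dp[i+1][j-1], that entry must be computed first. The code below puts j in the outer loop in increasing order, so column j-1 is complete before any entry in column j is filled; the inner loop walks i backwards from j-1 down to 0.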
def longestPalindrome(s: str) -> str:
    n = len(s)
    dp = [[False]*n for i in range(n)]
    for i in range(n):
        dp[i][i] = True
    # res is the best length found so far, start is where it begins.
    res, start = 1, 0
    for j in range(1, n):
        for i in range(j-1, -1, -1):
            if s[i] == s[j]:
                if j - i <= 2:
                    dp[i][j] = True
                else:
                    dp[i][j] = dp[i+1][j-1]
            if dp[i][j] and j - i + 1 > res:
                res = j - i + 1
                start = i
    return s[start: start+res]
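Another traversal order that satisfies the same dependency is to fill the table by increasing substring length, since dp[i+1][j-1] always describes a substring two characters shorter than dp[i][j]. The following is a sketch of that variant; the function name `longest_palindrome_by_length` is illustrative, and when several longest palindromes exist it returns the leftmost one, which may differ from the result of longestPalindrome.

```python
def longest_palindrome_by_length(s):
    """Longest palindromic substring, filling dp by substring length (sketch)."""
    n = len(s)
    if n == 0:
        return ""
    dp = [[False] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = True  # every single character is a palindrome
    start, best = 0, 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # length <= 3 is the same condition as j - i <= 2.
            if s[i] == s[j] and (length <= 3 or dp[i + 1][j - 1]):
                dp[i][j] = True
                if length > best:
                    start, best = i, length
    return s[start:start + best]


print(longest_palindrome_by_length("cbbd"))  # bb
```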