动态规划总结2 股票买卖问题
在Leetcode
上有一系列经典的股票买卖问题,也是典型的动态规划问题.也可以利用在每一天的状态是否买卖来解决.然而每一天的买卖关系受到了前一天操作的逻辑上的约束,看起来更像是一个买卖之间状态的转移关系.
Leetcode 121. Best Time to Buy and Sell Stock
在最基础的版本里,给定每一天的股票价格,只允许买卖至多一次,寻求可以获得的最大利润.
Example
Input: prices = [7,1,5,3,6,4]
Output: 5
Explanation: Buy on day 2 (price = 1) and sell on day 5 (price = 6), profit = 6-1 = 5.
Note that buying on day 2 and selling on day 1 is not allowed because you must buy before you sell.
实际上这题如果不用动态规划的思想去做,也很好理解,只要找到基于当前这一天最小的买入值,用今天的卖出值减掉,获得的利润就是当天最大的,再进行比较就可以获得所有天中最大的.
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
min_value = float('inf')
res = 0
for i in range(len(prices)):
min_value = min(min_value, prices[i])
res = max(prices[i]-min_value, res)
return res
但是,如何使用动归来解决和描述呢?我们用下面的状态转移图来进行表达:
现在有两个状态A
和B
,显然A
状态只能通过B
状态卖出或者保持不买卖才能到达,同理,B
状态可以保持不变,但也可以通过A
状态买入才能到达.那么可以得出一个公式:
A[i] = max(A[i-1], B[i-1]+nums[i])
B[i] = max(B[i-1], A[i-1]-nums[i])
由于刚开始并没有办法卖出,所以A[0]=0
但是可以买入,因此B[0]=-nums[0]
A[i]
和B[i]
是第i
天可能出现的所有情况的值.因为要求获得的最大利润,那必然是处于A
状态下的结果.
本题中只能至多只能买卖一次,也就意味着状态应该是由B->A
或者A
保持不变两种情况:
B->A
,B
无法从A
中转移过来;B->B
, 保持与前一天一致
所以B
的式子变为:
B[i] = max(B[i-1], -nums[i])
可以看出来,它所代表的意思,也是到当天为止所卖出的最大值(此处是负数).那么结合A
的式子,代码就可以写成:
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
b = [float('-inf') for i in range(len(prices))]
b[0] = -prices[0]
for i in range(1, len(prices)):
b[i] = max(-prices[i], b[i-1])
res = 0
for i in range(1, len(prices)):
res = max(prices[i]+b[i-1], res)
return res
将一维数组再优化一下,就可以得出第一种直观的答案了:
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
b = -prices[0]
res = 0
for i in range(1, len(prices)):
b = max(b, -prices[i])
res = max(prices[i]+b, res)
return res
如果可以允许多次买卖的话,那么就可以直接利用上面的公式了.比如下面这题.
Leetcode 122. Best Time to Buy and Sell Stock II
Example:
Input: prices = [7,1,5,3,6,4]
Output: 7
Explanation: Buy on day 2 (price = 1) and sell on day 3 (price = 5), profit = 5-1 = 4.
Then buy on day 4 (price = 3) and sell on day 5 (price = 6), profit = 6-3 = 3.
解答如下,根据公式写的优化了一下:
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
# A = [0 for i in range(len(prices)]
# B = ['_' for j in range(len(prices)]
A = 0
B = -prices[0]
res = 0
for i in range(1, len(prices)):
prev_a, prev_b = A, B
# A[i] = max(A[i-1], B[i-1]+nums[i])
A = max(prev_a, prev_b+prices[i])
# B[i] = max(B[i-1], A[i-1]-nums[i])
B = max(prev_b, prev_a-prices[i])
# res = max(A[i], res)
res = max(A, res)
return res
购买流程如下图蓝色框所示:
利用状态转移图,可以解决股票买卖的各种变种.比如,交易需要费用:
Leetcode 714. Best Time to Buy and Sell Stock with Transaction Fee
只需要在原有的基础上扣除费用:
class Solution(object):
def maxProfit(self, prices, fee):
"""
:type prices: List[int]
:type fee: int
:rtype: int
"""
A = 0
B = -prices[0]
res = 0
for i in range(1, len(prices)):
prev_a = A
A = max(B+prices[i]-fee, A)
B = max(prev_a-prices[i], B)
res = max(res, A)
return res
又或者如下面的题目,限定交易的次数.
Leetcode 123. Best Time to Buy and Sell Stock III
本例中,至多只能进行2次股票买卖.
Example
Input: prices = [3,3,5,0,0,3,1,4]
Output: 6
Explanation: Buy on day 4 (price = 0) and sell on day 6 (price = 3), profit = 3-0 = 3.
Then buy on day 7 (price = 1) and sell on day 8 (price = 4), profit = 4-1 = 3.
事实上,通用公式可以总结如下:
假设截止到第i
天进行了k
次交易,定为dp[k,i]
,那么(忽略边界条件):
dp[k,j] = max(dp[k-1, j-1] + prices[i]-prices[j]+dp[k-1,j-1]) 其中`0<=j<=i`
参照大佬的解法:Detail explanation of DP solution
如果尝试使用状态转移图的话,也可以解答:
A
是最终能获得的利润,除了一次都不交易,它可以从两种状态转移而来:
# 只交易一次
A1 -> B1 -> A
# 交易两次
A1 -> B1 -> A2 -> B2 -> A
所以可以得出下面的状态转移式子(A1
是初始状态,为0):
B1[i] = max(-prices[i], B1[i-1])
A2[i] = max(A2[i-1], B1[i-1]+prices[i])
B2[i] = max(B2[i-1], A2[i-1]-prices[i])
A[i] = max(max(B2[i-1], B1[i-1])+prices[i], A[i-1])
只需要置换一下顺序,就可以不用数组以及中间变量:
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
# b1 = [float('-inf') for i in range(len(prices))]
# a2 = [float('-inf') for i in range(len(prices))]
# b2 = [float('-inf') for i in range(len(prices))]
b1 = -prices[0]
a2 = b2 = float('-inf')
# b1[0] = -prices[0]
res = 0
for i in range(1, len(prices)):
# b1[i] = max(-prices[i], b1[i-1])
# b2[i] = max(a2[i-1]-prices[i], b2[i-1])
# a2[i] = max(b1[i-1]+prices[i], a2[i-1])
# res = max(max(b1[i-1], b2[i-1]) + prices[i], res)
res = max(max(b1, b2)+prices[i], res)
b2 = max(b2, a2-prices[i])
a2 = max(a2, b1+prices[i])
b1 = max(-prices[i], b1)
return res
Leetcode 188. Best Time to Buy and Sell Stock IV
在上一题的基础上延伸一下,如果至多可以交易k
次,能获得的最大利润是多少呢?
再将状态转移图延伸一下,比如可以交易三次,那么在原有的基础上可以将公式变为:
B1[i] = max(-prices[i], B1[i-1])
A2[i] = max(A2[i-1], B1[i-1]+prices[i])
B2[i] = max(B2[i-1], A2[i-1]-prices[i])
A3[i] = max(A3[i-1], B2[i-1]+prices[i])
B3[i] = max(B3[i-1], A3[i-1]-prices[i])
A[i] = max(max(B2[i-1], B1[i-1], B3[i-1])+prices[i], A[i-1])
所以可以总结出一个规律:
A[k][i] = max(A[k-1][i-1], B[k-1][i-1]+prices[i])
B[k][i] = max(B[k-1][i-1], A[k][i-1]-prices[i])
i
这个维度可以去掉,所以只需要两个一维数组即可:
class Solution(object):
def maxProfit(self, k, prices):
"""
:type k: int
:type prices: List[int]
:rtype: int
"""
res = 0
if not prices or k <= 0:
return res
a = [float('-inf') for i in range(k)]
b = [float('-inf') for i in range(k)]
b[0] = -prices[0]
b_max = b[0]
for i in range(1, len(prices)):
for j in range(1, k):
b_max= max(b[j], b_max)
b[j] = max(a[j]-prices[i], b[j])
a[j] = max(b[j-1]+prices[i], a[j])
b[0] = max(-prices[i], b[0])
res = max(max(b[0],b_max)+prices[i], res)
return res
同样地,中间增加其他状态也可以利用状态图解决.
Leetcode 309. Best Time to Buy and Sell Stock with Cooldown
当卖出股票后,至少要休息一天才能进行下一次交易.
Example
Input: prices = [1,2,3,0,2]
Output: 3
Explanation: transactions = [buy, sell, cooldown, buy, sell]
这个时候从A
状态转移到B
后,B
无法再直接转到A
,而是需要休息,因此设定休息的状态为C
,状态转移图如下:
根据转移图又可以得到:
A[i] = max(A[i-1], C[i-1])
B[i] = max(A[i-1]-prices[i], B[i-1])
C[i] = max(B[i-1]+prices[i], C[i-1])
即除了保持自己的状态,不做任何操作外,A
状态只能由C
转移而来;B
状态只能由A
进行买入;C
状态只能从B
状态卖出得来.
同样地,第一天最多只能进行买入操作,因此B[0] = -prices[0]
,而A
持有的收益最大为0, C
在第一天不可能有收益,因此初始化负无穷(和前面题目A2
,A3
类似,可以理解为无效值用以占位).可以获得的最大收益只可能出现在A
或者C
上,使用数组和不是用数组的代码如下:
class Solution(object):
def maxProfit(self, prices):
"""
:type prices: List[int]
:rtype: int
"""
# A = [0 for i in range(len(prices))]
# B = [0 for i in range(len(prices))]
# C = [0 for i in range(len(prices))]
A = 0
C = float('-inf')
B = -prices[0]
# B[0] = -prices[0]
res = 0
for i in range(1, len(prices)):
# B[i] = max(A[i-1]-prices[i], B[i-1])
# A[i] = max(C[i-1], A[i-1])
# C[i] = max(B[i-1]+prices[i], C[i-1])
prev_b = B
B = max(A-prices[i], B)
A = max(C, A)
C = max(prev_b+prices[i], C)
res = max(C, A, res)
return res
至此,就解决了所有的股票买卖问题.