Dynamic Programming | Set 3 (Longest Increasing Subsequence)
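In Dynamic Programming | Set 1 (Overlapping Subproblems Property) and Dynamic Programming | Set 2 (Optimal Substructure Property) we discussed the overlapping subproblems and optimal substructure properties. Now let's look at a problem that can be solved with dynamic programming: the Longest Increasing Subsequence (LIS).
The LIS problem asks for the longest subsequence of a given sequence whose elements are in strictly increasing order. The elements of a subsequence keep their original relative order but do not have to be adjacent. For example, the LIS of the sequence {10, 22, 9, 33, 21, 50, 41, 60, 80} has length 6, and one such subsequence is {10, 22, 33, 50, 60, 80}.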
Optimal Substructure:
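Let arr[0..n-1] be the input array, and let L(i) be the length of the longest increasing subsequence that ends at index i, that is, the LIS in which arr[i] is the last element. Then L(i) can be written recursively as:
L(i) = 1 + max( L(j) ) where 0 <= j < i and arr[j] < arr[i]; if there is no such j, then L(i) = 1
To get the length of the LIS of the whole array, we return max( L(i) ) where 0 <= i < n.
So the LIS problem has the optimal substructure property: the answer to the main problem can be built from the answers to its subproblems.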
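To make the recurrence concrete, the table below works it out by hand for the example sequence from the introduction. It is only an illustration of the formula above, not part of the original programs.

```latex
\begin{array}{c|ccccccccc}
i      & 0  & 1  & 2 & 3  & 4  & 5  & 6  & 7  & 8  \\ \hline
arr[i] & 10 & 22 & 9 & 33 & 21 & 50 & 41 & 60 & 80 \\
L(i)   & 1  & 2  & 1 & 3  & 2  & 4  & 4  & 5  & 6
\end{array}

% Example step: every element before index 5 is smaller than arr[5] = 50, so
% L(5) = 1 + \max\{L(0), L(1), L(2), L(3), L(4)\} = 1 + 3 = 4.
% The final answer is \max_{0 \le i < 9} L(i) = L(8) = 6.
```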
Overlapping Subproblems:
Below is a simple recursive implementation of the LIS problem.

/* A naive recursive implementation of the LIS problem */
#include <stdio.h>

/* To make use of recursive calls, this function must return two things:
   1) The length of the LIS ending with element arr[n-1].
      We use max_ending_here for this purpose.
   2) The overall maximum, as the LIS may end with an element before
      arr[n-1]. max_ref is used for this purpose.
   The length of the LIS of the full array of size n is stored in *max_ref,
   which is our final result. */
int _lis(int arr[], int n, int *max_ref)
{
    /* Base case: a single element is an increasing subsequence of length 1 */
    if (n == 1)
        return 1;

    int res, max_ending_here = 1; /* length of LIS ending with arr[n-1] */

    /* Recursively get the LIS lengths ending with arr[0], arr[1] ... arr[n-2].
       If arr[i-1] is smaller than arr[n-1] and the maximum ending with
       arr[n-1] needs to be updated, then update it. */
    for (int i = 1; i < n; i++) {
        res = _lis(arr, i, max_ref);
        if (arr[i-1] < arr[n-1] && res + 1 > max_ending_here)
            max_ending_here = res + 1;
    }

    /* Compare max_ending_here with the overall max and update the
       overall max if needed */
    if (*max_ref < max_ending_here)
        *max_ref = max_ending_here;

    /* Return the length of the LIS ending with arr[n-1] */
    return max_ending_here;
}

/* The wrapper function for _lis() */
int lis(int arr[], int n)
{
    /* max holds the result */
    int max = 1;

    /* _lis() stores its result in max */
    _lis(arr, n, &max);

    return max;
}

/* Driver program to test the above function */
int main()
{
    int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Length of LIS is %d\n", lis(arr, n));
    return 0;
}
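Consider the implementation above. The following is the recursion tree for an array of size 4, where lis(n) stands for the call on the first n elements, that is, the length of the LIS that ends with arr[n-1].

                lis(4)
          /       |       \
      lis(3)    lis(2)    lis(1)
      /    \      |
  lis(2)  lis(1) lis(1)
    |
  lis(1)

It is easy to see that some subproblems are solved more than once. The problem therefore has the overlapping subproblems property, and the recomputation can be avoided with either memoization or tabulation. Below is a tabulated implementation of the LIS problem.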
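The paragraph above mentions memoization, while the listing that follows uses tabulation. For comparison, here is a minimal sketch of a memoized (top-down) variant. The names lis_ending_at and lis_memo are hypothetical and chosen only for this illustration; the sketch assumes a cached value of 0 can mean "not computed yet", which works because every real LIS length is at least 1.

```c
#include <stdlib.h>

/* Length of the LIS that ends exactly at arr[i].
   memo[i] == 0 means the value has not been computed yet. */
static int lis_ending_at(const int arr[], int i, int memo[])
{
    if (memo[i] != 0)
        return memo[i];               /* reuse the stored subproblem result */

    int best = 1;                     /* arr[i] alone */
    for (int j = 0; j < i; j++) {
        if (arr[j] < arr[i]) {
            int candidate = lis_ending_at(arr, j, memo) + 1;
            if (candidate > best)
                best = candidate;
        }
    }
    memo[i] = best;
    return best;
}

/* Hypothetical wrapper: returns the LIS length, 0 for an empty array,
   or -1 if the memo table cannot be allocated. */
int lis_memo(const int arr[], int n)
{
    if (n <= 0)
        return 0;

    int *memo = calloc((size_t)n, sizeof *memo);  /* all entries start at 0 */
    if (memo == NULL)
        return -1;

    int max = 0;
    for (int i = 0; i < n; i++) {
        int len = lis_ending_at(arr, i, memo);
        if (len > max)
            max = len;
    }
    free(memo);
    return max;
}
```

Each L(i) is computed once and then read from memo, so the running time is O(n^2), the same as the tabulated version.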
/* Dynamic programming implementation of the LIS problem */
#include <stdio.h>
#include <stdlib.h>

/* lis() returns the length of the longest increasing subsequence
   in arr[] of size n, or -1 if memory allocation fails */
int lis(int arr[], int n)
{
    int *lis, i, j, max = 0;

    /* lis[i] will hold the length of the LIS ending at arr[i] */
    lis = (int *) malloc(sizeof(int) * n);
    if (lis == NULL)
        return -1;

    /* Initialize LIS values for all indexes */
    for (i = 0; i < n; i++)
        lis[i] = 1;

    /* Compute optimized LIS values in bottom-up manner */
    for (i = 1; i < n; i++)
        for (j = 0; j < i; j++)
            if (arr[i] > arr[j] && lis[i] < lis[j] + 1)
                lis[i] = lis[j] + 1;

    /* Pick the maximum of all LIS values */
    for (i = 0; i < n; i++)
        if (max < lis[i])
            max = lis[i];

    /* Free memory to avoid a memory leak */
    free(lis);

    return max;
}

/* Driver program to test the above function */
int main()
{
    int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Length of LIS is %d\n", lis(arr, n));
    return 0;
}
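The tabulated program returns only the length, while the introduction also shows an actual subsequence. One common way to recover it is to remember, for each index, which earlier index gave the best L(i), and then walk those links backwards. The sketch below illustrates that idea; the function name lis_sequence, the prev[] array, and the -1 error value are assumptions made for this example and are not part of the original program.

```c
#include <stdlib.h>

/* Writes one LIS of arr[0..n-1] into out[] (which must have room for
   n ints) and returns its length. Returns 0 for an empty array and
   -1 if allocation fails. */
int lis_sequence(const int arr[], int n, int out[])
{
    if (n <= 0)
        return 0;

    int *len  = malloc(sizeof *len  * (size_t)n);  /* len[i]  = L(i)           */
    int *prev = malloc(sizeof *prev * (size_t)n);  /* prev[i] = predecessor j  */
    if (len == NULL || prev == NULL) {
        free(len);
        free(prev);
        return -1;
    }

    int best_end = 0;                 /* index where the longest LIS ends */
    for (int i = 0; i < n; i++) {
        len[i] = 1;
        prev[i] = -1;                 /* -1 means arr[i] starts the subsequence */
        for (int j = 0; j < i; j++) {
            if (arr[j] < arr[i] && len[j] + 1 > len[i]) {
                len[i] = len[j] + 1;
                prev[i] = j;
            }
        }
        if (len[i] > len[best_end])
            best_end = i;
    }

    /* Follow the predecessor links from the end and fill out[] back to front */
    int length = len[best_end];
    int k = length - 1;
    for (int i = best_end; i != -1; i = prev[i])
        out[k--] = arr[i];

    free(len);
    free(prev);
    return length;
}
```

For the sequence {10, 22, 9, 33, 21, 50, 41, 60, 80} this returns 6 and fills out[] with 10, 22, 33, 50, 60, 80. When several longest subsequences exist, the sketch returns just one of them.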
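Note that the dynamic programming solution above takes O(n^2) time, and that the LIS problem actually has an O(n log n) solution. This post does not walk through the O(n log n) solution; LIS is used here only as a simple example of dynamic programming.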
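Although the post does not go through the faster solution, readers who want a rough idea of what it looks like may find the following sketch useful. It uses one well-known approach: keep, for every length, the smallest value that can end an increasing subsequence of that length, and place each new element with a binary search. The names lis_nlogn and tails are hypothetical, and this is a sketch under those assumptions rather than the method of the reference the post had in mind.

```c
#include <stdlib.h>

/* Returns the length of the longest strictly increasing subsequence,
   0 for an empty array, or -1 if allocation fails. */
int lis_nlogn(const int arr[], int n)
{
    if (n <= 0)
        return 0;

    /* tails[k] = smallest possible last element of an increasing
       subsequence of length k + 1 seen so far; tails[] stays sorted. */
    int *tails = malloc(sizeof *tails * (size_t)n);
    if (tails == NULL)
        return -1;

    int size = 0;                     /* number of valid entries in tails[] */
    for (int i = 0; i < n; i++) {
        /* Binary search for the first position whose value is >= arr[i] */
        int lo = 0, hi = size;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (tails[mid] < arr[i])
                lo = mid + 1;
            else
                hi = mid;
        }
        tails[lo] = arr[i];           /* extend, or improve an existing length */
        if (lo == size)
            size++;
    }

    free(tails);
    return size;
}
```

Each element costs one binary search over at most n entries, which gives O(n log n) overall. Note that tails[] holds candidate end values only; it is not itself guaranteed to be a valid subsequence of the input.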
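As a supplement, the most naive approach is to enumerate every subsequence with DFS and take the longest one that is increasing. The code below implements only the enumeration step: it collects all 2^n subsequences of an n-element input and prints them. It does not filter for increasing order or compute the maximum length.

#include <iostream>
#include <vector>
using namespace std;

// Enumerate every subsequence of input: at each position pos, either
// include input[pos] in path or skip it, then recurse on pos + 1.
void dfs(const vector<int> &input, vector<vector<int> > &ret,
         vector<int> &path, size_t pos)
{
    if (pos == input.size()) {
        ret.push_back(path);  // path is one complete subsequence
        return;
    }

    // Choice 1: include input[pos]
    path.push_back(input[pos]);
    dfs(input, ret, path, pos + 1);
    path.pop_back();

    // Choice 2: skip input[pos]
    dfs(input, ret, path, pos + 1);
}

int main()
{
    vector<int> input = {1, 2, 3};  // brace initialization needs C++11
    cout << "input.size() = " << input.size() << endl;

    vector<vector<int> > ret;
    vector<int> path;
    dfs(input, ret, path, 0);

    // Print each subsequence on its own line, prefixed with "--"
    for (vector<vector<int> >::const_iterator itr = ret.begin(); itr != ret.end(); itr++) {
        cout << "--" << " ";
        for (vector<int>::const_iterator it = itr->begin(); it != itr->end(); it++) {
            cout << *it << " ";
        }
        cout << endl;
    }
}

/*
Output:
input.size() = 3
-- 1 2 3
-- 1 2
-- 1 3
-- 1
-- 2 3
-- 2
-- 3
--
*/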
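To connect the enumeration back to the LIS problem, the following sketch makes the same include-or-skip decision at every position but only includes an element when it keeps the subsequence increasing, and it returns the best length instead of storing every subsequence. It is written in C to match the earlier listings. The names brute and lis_brute_force are hypothetical, and the running time remains exponential in the worst case, so it is useful only as a baseline for small inputs.

```c
/* Best length of an increasing subsequence chosen from arr[pos..n-1],
   given that the last element already taken is arr[last]
   (last == -1 means nothing has been taken yet). */
static int brute(const int arr[], int n, int pos, int last)
{
    if (pos == n)
        return 0;                     /* no elements left to decide on */

    /* Choice 1: skip arr[pos] */
    int best = brute(arr, n, pos + 1, last);

    /* Choice 2: take arr[pos], allowed only if the order stays increasing */
    if (last < 0 || arr[last] < arr[pos]) {
        int taken = 1 + brute(arr, n, pos + 1, pos);
        if (taken > best)
            best = taken;
    }
    return best;
}

/* Hypothetical wrapper: returns the LIS length, or 0 for an empty array. */
int lis_brute_force(const int arr[], int n)
{
    return n <= 0 ? 0 : brute(arr, n, 0, -1);
}
```

For the array {10, 22, 9, 33, 21, 50, 41, 60} used by the driver programs above, lis_brute_force returns 5, which agrees with the dynamic programming result.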