Leetcode | Minimum Window Substring

Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).

For example,
S = "ADOBECODEBANC"
T = "ABC"
Minimum window is "BANC".

Note:
If there is no such window in S that covers all characters in T, return the emtpy string "".

If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.

碰到这种最小区间,或者子段的题,我总是要和最长不减子序列、最大连续区间这些想在一起,走进死胡同。

关于题意:

1. 这里不需要去重,leetcode的test case要求如果T中有重复字符,那么区间内也要相应有有同样数目的重复字符;

参考了网上水中的鱼的解法,就是维护一个区间,然后尝试去缩小,最终得到最小区间。可以这样理解:

对于位置i,找到最小的区间[start, i]包括T的所有字符。最后对所有的区间取最小。

对于查找某个字符存在,这里用了计数的技巧。就是维护一个数组(256大小)记录T中每个字符出现的次数;维护另一个数组,记录当前区间中,每个字符出现的次数;

1. 每次要统计出现的T中的字符,只有第一次出现才需要计数;(Line 15);

2. 缩小当前区间[start, i];(Line 19-21);

3. 保存最小区间,起始位置和长度都要记录;(Line 22-25);

4. 如果没有覆盖T的所有字符,返回空。 (Line 29);

 1 class Solution {
 2 public:
 3     string minWindow(string S, string T) {
 4         if (T.empty() || S.empty()) return "";
 5         int n1 = S.length(), n2 = T.length();
 6         int start = 0, min = n1 + 1, minStart = 0;
 7         int charsOfT[256] = {0};
 8         int appearChars[256] = {0}, appeared = 0;
 9         for (int i = 0;  i < n2; ++i) {
10             charsOfT[T[i]]++;
11         }
12         
13         for (int i = 0;  i < n1; ++i) {
14             if (charsOfT[S[i]] > 0) {
15                 if (appearChars[S[i]] < charsOfT[S[i]]) appeared++;
16                 appearChars[S[i]]++;
17             }
18             if (appeared == n2) {
19                 for (; start <= i && (charsOfT[S[start]] == 0 || appearChars[S[start]] > charsOfT[S[start]]); ++start) {
20                     appearChars[S[start]]--;
21                 }
22                 if (i - start + 1 < min) {
23                     min = i - start + 1;
24                     minStart = start;
25                 }
26             }
27         }
28 
29         if (appeared < n2) return "";
30         else return S.substr(minStart, min);
31     }
32 };

最坏情况是接近扫描S两遍,所以整个时间复杂度还是O(n)。

事后诸葛:

如果要同时考虑区间覆盖的T的数目,还有区间的长度的话,整个问题就会非常复杂;不如假设某个区间覆盖了T所有的元素,那么就剩下求最小区间长度了。

于是很自然地可以想到,到了位置i,我找到了最小的区间[start, i]。这是[0,i]区间的一个代表;把所有的代表都找了出来,求最小,就是结果了。

第三次刷,写得复杂了。最后一个case卡了好久,后来发现是freqs和appeared数组定义成char[]了,改成int[]就行了。 看来是char溢出了。

 1 class Solution {
 2 public:
 3     string minWindow(string S, string T) {
 4         if (T.empty() || S.empty()) return "";
 5         int minIndex = -1, minLen = S.length() + 1, n1 = S.length(), n2 = T.length(), start = 0, next = 0, charsScanned = 0;
 6         int freqs[256], appeared[256];
 7         memset(freqs, 0, 256 * sizeof(int));
 8         memset(appeared, 0, 256 * sizeof(int));
 9         for (int i = 0; i < n2; ++i) {
10             freqs[T[i]]++;
11         }
12         
13         while (start < n1) {
14             for (; next < n1 && charsScanned < n2; ++next) { // find region that can cover all the letters in T
15                 if (freqs[S[next]] == 0) continue;
16                 appeared[S[next]]++;
17                 if (appeared[S[next]] <= freqs[S[next]]) charsScanned++;
18             }
19             if (charsScanned < n2) break; // no longer get a new matched region
20             
21             for (; start < next; start++) { // minimize the region
22                  if (freqs[S[start]] == 0) continue;
23                  if (appeared[S[start]] <= freqs[S[start]]) break;
24                  appeared[S[start]]--;
25             }
26             if (next - start < minLen) {
27                 minIndex = start;
28                 minLen = next - start;
29             }
30             appeared[S[start++]]--;
31             charsScanned--;
32         }
33         
34         if (minIndex == -1) return "";
35         else return S.substr(minIndex, minLen);
36     }
37 };

 

posted @ 2014-05-11 22:48  linyx  阅读(439)  评论(0编辑  收藏  举报