HDU 5672 String: two-pointer (ruler / chasing) method
String
Problem Description
There is a string S. S contains only lowercase English letters. (10≤length(S)≤1,000,000)
How many substrings are there that contain at least k (1≤k≤26) distinct characters?
Input
There are multiple test cases. The first line of input contains an integer T(1≤T≤10) indicating the number of test cases. For each test case:
The first line contains string S.
The second line contains an integer k (1≤k≤26).
Output
For each test case, output the number of substrings that contain at least k distinct characters.
Sample Input
2
abcabcabca
4
abcabcabcabc
3
Sample Output
0
55
Problem summary:
Given a string of length between 10 and 1,000,000 consisting only of lowercase letters, count the substrings that contain at least k (1 ≤ k ≤ 26) distinct letters.
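Solution:
There is an obvious property: if the substring (i, j) contains at least k distinct characters, then every substring (i, m) with j < m ≤ length(S) also contains at least k distinct characters, because extending a substring to the right never removes a character.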
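Therefore, for each left boundary it is enough to find the smallest right boundary that satisfies the condition. All qualifying substrings that start at this left boundary can then be counted in O(1) time: with 1-indexed positions, if that smallest right boundary is r, there are length(S) - r + 1 of them.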
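The counting step can be written out as a formula and checked against the second sample. The notation below is an assumption introduced here for illustration: n = length(S), positions are 1-indexed, and r_l denotes the smallest right boundary for left boundary l (when one exists).

```latex
\[
\text{ans} \;=\; \sum_{\substack{1 \le l \le n \\ r_l \text{ exists}}} \bigl(n - r_l + 1\bigr)
\]
% Second sample: S = abcabcabcabc, n = 12, k = 3.
% Any three consecutive letters are distinct, so r_l = l + 2 for l = 1, ..., 10,
% and no valid right boundary exists for l = 11 or l = 12.
\[
\text{ans} \;=\; \sum_{l=1}^{10} \bigl(12 - (l + 2) + 1\bigr)
          \;=\; \sum_{l=1}^{10} (11 - l)
          \;=\; 10 + 9 + \dots + 1 \;=\; 55
\]
```

This matches the expected output of 55. In the first sample the string has only three distinct letters, so with k = 4 no r_l exists and the sum is empty, giving 0.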
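Finding this right boundary is a classic problem for the chasing method (also called the ruler method or the two-pointer method). Maintain two pointers (array indices), advance the left and right boundaries in turn, and accumulate the answer along the way. Both pointers only ever move forward, so the complexity is O(length(S)).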
    #include <iostream>
    #include <cstdio>
    #include <cstring>
    using namespace std;

    typedef long long ll;
    const int N = 1e6 + 10;

    int T, k, H[300];   // H[c]: occurrences of character c in the window s[l..r]
    char s[N];          // input string, stored 1-indexed

    int main()
    {
        scanf("%d", &T);
        while (T--)
        {
            scanf("%s%d", s + 1, &k);
            int n = strlen(s + 1);
            memset(H, 0, sizeof(H));
            int r = 0, cnt = 0;   // r: right boundary, cnt: distinct characters in the window
            ll ans = 0;
            for (int l = 1; l <= n; l++)
            {
                // Advance r until the window s[l..r] holds k distinct characters.
                while (cnt < k && r < n)
                {
                    ++r;
                    if (++H[s[r]] == 1) cnt++;
                }
                if (cnt < k) break;          // no later left boundary can succeed either
                ans = ans + n - r + 1;       // right boundaries r, r+1, ..., n all qualify
                if (--H[s[l]] == 0) --cnt;   // drop s[l] before moving the left boundary
            }
            cout << ans << endl;
        }
        return 0;
    }
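One way to gain confidence in the two-pointer count is to compare it with a direct enumeration on small inputs. The sketch below is not part of the original solution: `countTwoPointers` is a 0-indexed restatement of the same idea as a function, and `countBruteForce` is a hypothetical O(n^2) reference added purely for checking. Both names are assumptions made for this example, and both functions assume the input contains only lowercase letters.

```cpp
#include <array>
#include <string>

// Two-pointer count, written 0-indexed over the half-open window [l, r).
long long countTwoPointers(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    std::array<int, 26> freq{};   // occurrences of each letter inside the window
    int distinct = 0;             // number of letters with freq > 0
    int r = 0;                    // one past the last index in the window
    long long ans = 0;
    for (int l = 0; l < n; ++l) {
        // Extend the right end until the window holds k distinct letters.
        while (distinct < k && r < n) {
            if (++freq[s[r] - 'a'] == 1) ++distinct;
            ++r;
        }
        if (distinct < k) break;  // no valid substring starts at l or later
        ans += n - r + 1;         // inclusive right ends r-1, r, ..., n-1 all work
        if (--freq[s[l] - 'a'] == 0) --distinct;
    }
    return ans;
}

// Hypothetical O(n^2) reference: examine every substring s[i..j] directly.
long long countBruteForce(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    long long ans = 0;
    for (int i = 0; i < n; ++i) {
        std::array<bool, 26> seen{};
        int distinct = 0;
        for (int j = i; j < n; ++j) {
            if (!seen[s[j] - 'a']) {
                seen[s[j] - 'a'] = true;
                ++distinct;
            }
            if (distinct >= k) ++ans;
        }
    }
    return ans;
}
```

Calling both functions on the two sample strings (`"abcabcabca"` with k = 4 and `"abcabcabcabc"` with k = 3) should give the same results as the sample output. The brute-force version is only practical for short strings, since the real limit of 1,000,000 characters is far beyond what an O(n^2) scan can handle.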