HDU-1686-Oulipo KMP

Oulipo

Time Limit: 3000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 1245    Accepted Submission(s): 481


Problem Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

 

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
 

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

 

Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
 

Sample Output
1
3
0
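
For reference, the counting rule in the statement (every start position counts, so occurrences may overlap) can be pinned down with a brute-force sketch. The helper name count_naive is hypothetical and not part of the original solution. It takes O(|W|·|T|) time, which is far too slow for the worst case described above (a 10,000-letter word against a 1,000,000-letter text of identical letters), so it only serves to check small inputs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not from the original post: counts every start
   position in t at which w occurs, so overlapping occurrences all count. */
int count_naive(const char *w, const char *t)
{
    assert(w != NULL && t != NULL);
    size_t lw = strlen(w), lt = strlen(t);
    int count = 0;

    if (lw == 0 || lw > lt)
        return 0;
    /* Try every alignment of w against t and compare all lw characters. */
    for (size_t i = 0; i + lw <= lt; ++i)
        if (memcmp(t + i, w, lw) == 0)
            ++count;
    return count;
}
```

On the three sample cases this returns 1, 3 and 0, matching the sample output.
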
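   This problem differs from HDU-2087 (剪花布条, cutting the patterned cloth) in that it asks for the maximum number of times the word occurs in the text, and characters may be reused, i.e. occurrences may overlap. The second sample (AZA in AZAZAZA, answer 3) makes this clear. Here is a summary of how to handle this family of problems.
  Plain KMP: the problem usually asks for the first (leftmost) match, so the search stops as soon as one match completes. The loop condition is k<= lo&& j<= ls.
  Cutting cloth (HDU-2087): after each successful match, do not stop; set j= 0 and keep matching, so matches never overlap. The loop condition is k<= lo.
  This problem: after each successful match, do not stop; set j= next[j] instead, i.e. treat the completed match as if the current comparison had failed, so that the part already matched is reused as much as possible.
  The code is as follows: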
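
The three variants differ only in what happens when a match completes, which the following minimal sketch tries to make explicit. It is not the author's code: it uses the 0-indexed prefix function (failure table) rather than the 1-indexed next array, so j = fail[j - 1] plays the role of j = next[j]. The names kmp_count, build_fail and enum mode are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for the three variants discussed in the text. */
enum mode { FIRST_ONLY, NON_OVERLAPPING, OVERLAPPING };

/* fail[i] = length of the longest proper border of w[0..i]. */
static void build_fail(const char *w, size_t lw, size_t *fail)
{
    fail[0] = 0;
    for (size_t i = 1, j = 0; i < lw; ++i) {
        while (j > 0 && w[i] != w[j])
            j = fail[j - 1];
        if (w[i] == w[j])
            ++j;
        fail[i] = j;
    }
}

/* Returns the number of matches of w in t under mode m,
   or -1 if the failure table cannot be allocated. */
int kmp_count(const char *w, const char *t, enum mode m)
{
    assert(w != NULL && t != NULL);
    size_t lw = strlen(w), lt = strlen(t);
    int count = 0;

    if (lw == 0)
        return 0;
    size_t *fail = malloc(lw * sizeof *fail);
    if (fail == NULL)
        return -1;
    build_fail(w, lw, fail);

    for (size_t i = 0, j = 0; i < lt; ++i) {
        while (j > 0 && t[i] != w[j])
            j = fail[j - 1];
        if (t[i] == w[j])
            ++j;
        if (j == lw) {                 /* a full match ends at t[i] */
            ++count;
            if (m == FIRST_ONLY)
                break;                 /* plain KMP: stop at the first match */
            /* overlapping: reuse the longest border; otherwise start over */
            j = (m == OVERLAPPING) ? fail[j - 1] : 0;
        }
    }
    free(fail);
    return count;
}
```

For the word AZA and the text AZAZAZA, the three modes give 1, 2 and 3 respectively; only the last matches what this problem asks for.
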
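#include <stdbool.h>    /* bool, true, false when compiled as C */
#include <stdio.h>
#include <string.h>

/* o is the text T, s is the word W; both are stored 1-indexed. */
char o[1000005], s[10005];

/* next is the KMP table for s; ls and lo are the lengths of s and o. */
int next[10005], ls, lo;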

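/* Build the 1-indexed next table for the pattern s[1..ls].
   next[k] is the pattern position to retry after a mismatch at position k. */
void getnext( char *s, int *next )
{
    int k= 1, j= 0;
    next[1]= 0;    /* sentinel: nothing shorter to fall back to */
    while( k< ls )
    {
        if( j== 0|| s[k]== s[j] )
        {
            ++j, ++k;
            next[k]= j;
        }
        else
        {
            j= next[j];
        }
    }
}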

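/* Read one line of uppercase letters into str.
   Returns false if the line contains any other character. */
bool getstr( char *str )
{
    int c, i= 0;    /* int, so that EOF can be told apart from a character */
    bool ok= true;
    while( ( c= getchar(  ) )!= EOF&& c!= '\n' )
    {
        if( c>= 'A'&& c<= 'Z' )
        {
            str[i++]= c;
        }
        else if( c!= '\r' )    /* tolerate Windows line endings */
        {
            ok= false;
        }
    }
    str[i]= '\0';    /* always terminate, even on bad input */
    return ok;
}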

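/* Count the occurrences of s[1..ls] in o[1..lo]; overlaps are counted. */
int kmp( char *o, char *s, int *next )
{
    int k= 0, j= 0, ans= 0;
    while( k<= lo )
    {
        if( j== 0|| o[k]== s[j] )
        {
            if( j== ls )
            {
                /* Full match ending at o[k]: count it, then fall back as if
                   this comparison had failed, so the matched part is reused. */
                ans++;
                j= next[j];
                continue;
            }
            ++j, ++k;
        }
        else
        {
            j= next[j];    /* mismatch: shift the pattern */
        }
    }
    return ans;
}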

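int main(  )
{
    int T, c;
    if( scanf( "%d", &T )!= 1 )
    {
        return 0;
    }
    /* skip the rest of the first line, including "\r\n" endings */
    while( ( c= getchar(  ) )!= '\n'&& c!= EOF );
    while( T-- )
    {
        getstr( s+ 1 );
        getstr( o+ 1 );
        ls= strlen( s+ 1 );
        lo= strlen( o+ 1 );
        getnext( s, next );
        printf( "%d\n", kmp( o, s, next ) );
    }
    return 0;
}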
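
To see why setting j= next[j] after a full match is safe, it may help to look at the table itself. The sketch below rebuilds the same 1-indexed table as getnext above, but as a standalone function that takes the length as a parameter. The name build_next is hypothetical. Under this convention next[1] is 0 and, for k >= 2, next[k] is one more than the length of the longest proper border of p[1..k-1].

```c
#include <assert.h>
#include <string.h>

/* Hypothetical standalone version of getnext: fills next[1..len] for the
   1-indexed pattern p[1..len]; p[0] is unused padding. */
void build_next(const char *p, int len, int *next)
{
    int k = 1, j = 0;

    assert(p != NULL && next != NULL && len >= 1);
    next[1] = 0;                       /* sentinel for the first position */
    while (k < len) {
        if (j == 0 || p[k] == p[j]) {
            ++j;
            ++k;
            next[k] = j;               /* retry position after a mismatch at k */
        } else {
            j = next[j];               /* fall back to a shorter border */
        }
    }
}
```

For the word AZA (passed as " AZA" so that it is 1-indexed) the table is 0, 1, 1. After a full match ending at o[k], j becomes next[3] = 1, and o[k] is compared with s[1] again: the final A of one occurrence is reused as the first A of the next, which is how AZAZAZA yields 3.
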

posted @ 2011-07-26 09:13  沐阳