Finding the Most Common Word (String Processing) Problem

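Problem: Input Specification:
Each input file contains one test case. For each case, there is one line of text no more than 1048576 characters in length, terminated by a newline '\n'. The input contains at least one alphanumeric character, i.e., one character from the set [0-9 A-Z a-z].
Output Specification:
For each test case, print in one line the most commonly occurring word in the input text, followed by a space and the number of times it occurs in the input. If there is more than one such word, print the lexicographically smallest one. The word should be printed in all lower case. Here a "word" is defined as a continuous sequence of alphanumeric characters separated by non-alphanumeric characters or the line beginning/end.
Write the algorithm (note that words are case insensitive).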
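
The definition of a "word" above can be illustrated with a small sketch. The helper name splitWords is hypothetical and not part of the original answer; it assumes the default "C" locale, where std::isalnum accepts exactly the characters in [0-9 A-Z a-z].

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): split one line into
// lowercase words, where a word is a maximal run of alphanumeric characters.
std::vector<std::string> splitWords(const std::string& line) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : line) {
        if (std::isalnum(c)) {
            // Words are case insensitive, so store the lowercase form.
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            // A non-alphanumeric character ends the current word.
            words.push_back(current);
            current.clear();
        }
    }
    // A word may also be terminated by the end of the line.
    if (!current.empty()) words.push_back(current);
    return words;
}
```

Applied to the sample input below, this sketch yields the words can1, can, a, can, can, a, can, it, can, so "can" occurs 5 times and "can1" counts as a separate word.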

Sample Input:

Can1: "Can a can can a can? It can!"

Sample Output:

can 5

Answer:

#include <algorithm>
#include <cctype>
#include <iostream>
#include <map>
#include <string>
using namespace std;

// Returns true for characters in the set [0-9 A-Z a-z].
bool isalp(char a)
{
    return (a>='a'&&a<='z')||(a>='A'&&a<='Z')||(a>='0'&&a<='9');
}

int main()
{
    string str;
    // Use 'string' rather than 'char*' as the key: a char* key stores and
    // compares addresses, while a string key stores and compares the text.
    map<string,int> Count;
    map<string,int>::iterator it,best;
    while(getline(cin,str))
    {
        // Words are case insensitive, so lowercase the whole line first.
        transform(str.begin(),str.end(),str.begin(),
                  [](unsigned char c){ return static_cast<char>(tolower(c)); });
        size_t i=0,j;
        while(i<str.length())
        {
            j = i;
            // Check the bound before reading str[j].
            while(j<str.length()&&!isalp(str[j]))j++; // skip non-alphanumeric characters
            i = j;
            while(j<str.length()&&isalp(str[j]))j++;  // the word occupies [i, j)
            if(i!=j)
            {
                string word=str.substr(i,j-i); // substr(start, length)
                Count[word]++; // operator[] creates a missing entry with count 0
                i=j;
            }
        }
        // std::map iterates in ascending key order, so the strict '>' keeps
        // the lexicographically smallest word among those with equal counts.
        int maxCount = -1;
        best = Count.end();
        for(it = Count.begin();it!=Count.end();it++)
        {
            if(it->second>maxCount)
            {
                maxCount = it->second;
                best = it;
            }
        }
        if(best!=Count.end())
            cout<<best->first<<" "<<best->second<<endl;
    }
    return 0;
}
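
As a variation on the answer above (not the author's code), the same result can be computed with a hash map if the tie-break is made explicit instead of relying on the sorted iteration order of std::map. The function name mostFrequentWord and its return convention ({"", 0} when the line has no word) are assumptions made for this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical variant: returns {word, count} for the most frequent word,
// or {"", 0} if the line contains no alphanumeric character.
std::pair<std::string, int> mostFrequentWord(const std::string& line) {
    std::unordered_map<std::string, int> counts;
    std::pair<std::string, int> best("", 0);
    std::string current;
    // Run one step past the end so the last word is flushed like any other.
    for (std::size_t i = 0; i <= line.size(); ++i) {
        unsigned char c = i < line.size() ? static_cast<unsigned char>(line[i]) : ' ';
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
            continue;
        }
        if (current.empty()) continue;
        int n = ++counts[current];
        // Explicit tie-break: a higher count wins; on equal counts the
        // lexicographically smaller word wins.
        if (n > best.second || (n == best.second && current < best.first)) {
            best = std::make_pair(current, n);
        }
        current.clear();
    }
    return best;
}
```

Calling mostFrequentWord on the sample line gives the pair ("can", 5). The hash map avoids the ordered-tree lookups of std::map, at the cost of having to state the tie-break rule in code.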

posted @ 2015-05-10 23:11 chaoer