School of Computer Science Collegiate Programming Contest (2015'12): Study Words
Study Words
Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 195 Accepted Submission(s): 66
Problem Description
Learning English is not easy; vocabulary troubles me a lot.
One day an idea came to me: every day I download an article and choose the 10 most popular new words to study.
A word's popularity is the number of times it occurs in the article.
When two or more words occur the same number of times, the lexicographically smaller word has the higher popularity.
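The popularity rule above can be sketched as a comparator plus a partial sort. This is only an illustration, not the solution used later in this post: the names `WordCount`, `morePopular`, and `topK` are hypothetical, and the sketch assumes the counts have already been collected into (word, count) pairs.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration: a word together with its occurrence count.
using WordCount = std::pair<std::string, int>;

// Returns true when `a` is more popular than `b`:
// higher count first, ties broken by the lexicographically smaller word.
bool morePopular(const WordCount& a, const WordCount& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
}

// Returns at most `k` entries, most popular first.
std::vector<WordCount> topK(std::vector<WordCount> words, std::size_t k) {
    k = std::min(k, words.size());
    // Only the first k positions need to be in sorted order.
    std::partial_sort(words.begin(),
                      words.begin() + static_cast<std::ptrdiff_t>(k),
                      words.end(), morePopular);
    words.resize(k);
    return words;
}
```

Calling `topK(words, 10)` on the collected counts would give the words to print, in output order.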
Input
The first line contains T, the number of test cases.
Each case has two parts.
<oldwords>
...
</oldwords>
<article>
...
</article>
Between <oldwords> and </oldwords> are some old words (no more than 10000) I have already learned, that is, I don't need to learn them any more.
Words between <oldwords> and </oldwords> contain letters ('a'~'z','A'~'Z') only, separated by blank characters (' ','\n' or '\t').
Between <article> and </article> is an article (contains fewer than 1000000 characters).
Only consecutive letters ('a'~'z','A'~'Z') make up a word. Thus a word like "don't" is regarded as two words, "don" and "t"; that's OK.
Treat uppercase letters as lowercase, so "Thanks" is the same as "thanks". No word will be longer than 100 letters.
As the article is downloaded from the internet, it may contain some Chinese words, which I don't need to study.
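The word rules above (letters only, case-insensitive, everything else acts as a separator) can be sketched as a small tokenizer. This is a minimal illustration with a hypothetical function name, `extractWords`; it assumes the article is available as a byte string, so the bytes of Chinese characters simply fall outside 'a'~'z' and 'A'~'Z' and act as separators.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper for illustration: splits `text` into lowercase words,
// where a word is a maximal run of ASCII letters.
std::vector<std::string> extractWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (isLetter) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            // Any non-letter byte (punctuation, blanks, non-ASCII) ends the word.
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

For example, "Don't" yields the two words "don" and "t", matching the statement.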
Output
For each case, output the top 10 new words I should study, one per line.
If there are fewer than 10 new words, output all of them.
Output a blank line after each case.
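The output rules can be sketched as follows. The helper name `formatCase` is hypothetical, and it assumes the new words have already been ranked from most to least popular.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration: builds the output text for one case
// from words that are already ranked, most popular first.
std::string formatCase(const std::vector<std::string>& ranked) {
    std::string out;
    std::size_t limit = ranked.size() < 10 ? ranked.size() : 10;
    for (std::size_t i = 0; i < limit; ++i) {
        out += ranked[i];
        out += '\n';  // one word per line
    }
    out += '\n';      // blank line after each case, even an empty one
    return out;
}
```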
Sample Input
2
<oldwords>
how aRe you
</oldwords>
<article>
--How old are you?
--Twenty.
</article>
<oldwords>
google cn huluobo net i
</oldwords>
<article>
文章内容:
I love google,dropbox,firefox very much.
Everyday I open my computer , open firefox , and enjoy surfing on the inter-
net. But these days it's strange that searching "huluobo" is unavail-
able. What's wrong with "huluobo"?
</article>
Sample Output
old
twenty

firefox
open
s
able
and
but
computer
days
dropbox
enjoy

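#include <cstdio>
#include <cstring>
#include <map>
#include <string>
#include <algorithm>
using namespace std;

int T;
// One whitespace-delimited token. A token such as "google,dropbox,firefox"
// can be longer than a single word, so size it for the whole article.
char s[1000000 + 10];
char r[100 + 10];    // current word (at most 100 letters)
map<string, int> m;  // word -> count; -1 marks an old word

struct dan {
    char s[100 + 10];
    int num;
} d[1000000 + 10];
int sum;  // number of distinct new words
int tot;  // length of the word being built in r

// Higher count first; ties go to the lexicographically smaller word.
bool cmp(const dan &a, const dan &b) {
    if (a.num == b.num) return strcmp(a.s, b.s) < 0;
    return a.num > b.num;
}

// Convert s to lowercase.
void F() {
    for (int i = 0; s[i]; i++)
        if (s[i] >= 'A' && s[i] <= 'Z') s[i] = s[i] - 'A' + 'a';
}

// Mark every word in the token s as an old word.
void work() {
    int len = strlen(s);
    tot = 0;
    for (int i = 0; i <= len; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') r[tot++] = s[i];
        else {
            r[tot] = '\0';
            if (strlen(r) > 0) m[r] = -1;
            tot = 0;
        }
    }
}

// Count every new word in the token s.
void work2() {
    int len = strlen(s);
    tot = 0;
    for (int i = 0; i <= len; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') r[tot++] = s[i];
        else {
            r[tot] = '\0';
            if (strlen(r) > 0) {
                if (m[r] != -1) {
                    if (m[r] == 0) strcpy(d[sum++].s, r);  // first occurrence
                    m[r]++;
                }
            }
            tot = 0;
        }
    }
}

int main() {
    scanf("%d", &T);
    while (T--) {
        int flag = 0;  // 1: inside <oldwords>, 3: inside <article>
        m.clear();
        sum = 0;
        while (1) {
            scanf("%s", s);
            if (strcmp(s, "<oldwords>") == 0) { flag = 1; continue; }
            if (strcmp(s, "</oldwords>") == 0) { flag = 2; continue; }
            if (strcmp(s, "<article>") == 0) { flag = 3; continue; }
            if (strcmp(s, "</article>") == 0) break;
            if (flag == 1) { F(); work(); }
            if (flag == 3) { F(); work2(); }
        }
        for (int i = 0; i < sum; i++) d[i].num = m[d[i].s];
        sort(d, d + sum, cmp);
        for (int i = 0; i < min(10, sum); i++) printf("%s\n", d[i].s);
        printf("\n");
    }
    return 0;
}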
------------------- This is Qianqian's personal website! https://www.dreamwings.cn -------------------