HDU 1088: HTML Parsing
Write a simple HTML Browser
Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others)
Total Submission(s): 6598 Accepted Submission(s): 1753
Problem Description
If you ever tried to read an HTML document on a Macintosh, you know how hard it is if no Netscape is installed.
Now, who can forget to install an HTML browser? This is very easy because most of the time you don't need one on a Mac, because there is an Acrobat Reader which is native to the Mac. But if you ever need one, what do you do?
Your task is to write a small HTML browser. It should only display the content of the input file and know only the HTML commands (tags) <br>, which is a line break, and <hr>, which is a horizontal ruler. Then you should treat all tabulators, spaces and newlines as one space and display the resulting text with no more than 80 characters on a line.
Input
The input consists of a text you should display. This text consists of words and HTML tags separated by one or more spaces, tabulators or newlines.
A word is a sequence of letters, numbers and punctuation. For example, "abc,123" is one word, but "abc, 123" are two words, namely "abc," and "123". A word is always shorter than 81 characters and does not contain any '<' or '>'. All HTML tags are either <br> or <hr>.
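The word definition above lines up with whitespace-delimited tokenization. As a minimal sketch (the helper name split_words is hypothetical and not part of the problem statement), the stream extraction operator in C++ produces exactly these tokens, because it skips any run of spaces, tabs and newlines:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text into whitespace-delimited tokens.
// operator>> skips any run of spaces, tabs and newlines, so the tags
// <br> and <hr> come out as tokens of their own when they are
// separated from neighbouring words by whitespace, as the input format
// promises.
std::vector<std::string> split_words(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) {
        words.push_back(w);
    }
    return words;
}
```

Under this sketch, "abc,123" yields one token, while "abc, 123" yields the two tokens "abc," and "123".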
Output
You should display the resulting text using these rules:
. If you read a word in the input and the resulting line does not get longer than 80 chars, print it, else print it on a new line.
. If you read a <br> in the input, start a new line.
. If you read a <hr> in the input, start a new line unless you already are at the beginning of a line, display 80 characters of '-' and start a new line (again).
The last line is ended by a newline character.
Sample Input
Hallo, dies ist eine ziemlich lange Zeile, die in Html aber nicht umgebrochen wird. <br> Zwei <br> <br> produzieren zwei Newlines. Es gibt auch noch das tag <hr> was einen Trenner darstellt. Zwei <hr> <hr> produzieren zwei Horizontal Rulers. Achtung mehrere Leerzeichen irritieren Html genauso wenig wie mehrere Leerzeilen.
Sample Output
Hallo, dies ist eine ziemlich lange Zeile, die in Html aber nicht umgebrochen
wird.
Zwei

produzieren zwei Newlines. Es gibt auch noch das tag
--------------------------------------------------------------------------------
was einen Trenner darstellt. Zwei
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
produzieren zwei Horizontal Rulers. Achtung mehrere Leerzeichen irritieren Html
genauso wenig wie mehrere Leerzeilen.
——————————————————————————————————————
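In short, the task is to write a "browser" that parses HTML. It is really just string processing. The rules are:
- Each line holds at most 80 characters; anything beyond that wraps. Note that a word is never split: if appending a word would push the line past 80 characters, start a new line and print the whole word there instead of cutting it across two lines.
- A <br> becomes a newline.
- On <hr>, start a new line (unless already at the beginning of one), print 80 '-' characters, then start a new line again.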
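The wrapping rule comes down to one piece of arithmetic. As a small sketch (the name fits_on_line is hypothetical), a word fits on the current line if the characters already there, plus one separating space, plus the word itself stay within 80; a word that starts a line needs no space:

```cpp
#include <cstddef>

// Hypothetical helper: can a word of length word_len be appended to a line
// that already holds `current` characters without exceeding 80 columns?
bool fits_on_line(std::size_t current, std::size_t word_len) {
    const std::size_t kWidth = 80;
    if (current == 0) {
        // First word on a line: no separating space is needed.
        return word_len <= kWidth;
    }
    // Otherwise one extra column is used by the separating space.
    return current + 1 + word_len <= kWidth;
}
```

In the sample, the first line holds 77 characters after "umgebrochen", so appending "wird." would need 77 + 1 + 5 = 83 columns and the word moves to the next line.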
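At first I wanted to solve it in Java, but that turned out to be overkill, so I wrote it in C++ instead. There are quite a few details to get right.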
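#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s;
    string f(80, '-');
    size_t count = 0; // number of characters already on the current line
    while (cin >> s)
    {
        if (s == "<br>")
        {
            cout << endl;
            count = 0;
        }
        else if (s == "<hr>")
        {
            if (count) cout << endl; // if the line already has words, break it before the rule; otherwise print the rule directly
            cout << f << endl;
            count = 0;
        }
        else if (!count)
        {
            cout << s; // first word on a line: no leading space and no wrap check, so an 80-character word does not cause a blank line
            count = s.length();
        }
        else if (count + s.length() + 1 > 80) // the +1 accounts for the separating space
        {
            cout << endl << s;
            count = s.length();
        }
        else
        {
            cout << " " + s; // not the first word: print a leading space
            count += s.length() + 1;
        }
    }
    cout << endl; // the last line must end with a newline; forgetting it is a likely cause of PE (Presentation Error)
    return 0;
}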
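To check the rules without going through standard input, the logic can be wrapped in a function that returns the rendered text. This is only a sketch of one possible refactoring, not the submitted solution: the name render, the kWidth constant and the string overload are assumptions introduced here, and the first word on a line is handled as a separate case so that it is never preceded by a line break.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical refactoring: apply the <br>/<hr>/80-column rules to any input
// stream and return the rendered text instead of printing it.
std::string render(std::istream& in) {
    const std::size_t kWidth = 80;
    const std::string rule(kWidth, '-');
    std::string out, word;
    std::size_t col = 0;  // characters already on the current line
    while (in >> word) {
        if (word == "<br>") {
            out += '\n';
            col = 0;
        } else if (word == "<hr>") {
            if (col > 0) out += '\n';  // finish a partially filled line first
            out += rule;
            out += '\n';
            col = 0;
        } else if (col == 0) {
            out += word;  // first word on a line: no leading space
            col = word.size();
        } else if (col + 1 + word.size() <= kWidth) {
            out += ' ';
            out += word;
            col += 1 + word.size();
        } else {
            out += '\n';  // word does not fit: move it to a new line
            out += word;
            col = word.size();
        }
    }
    out += '\n';  // assumption: the output is always terminated by a newline
    return out;
}

// Convenience overload for string input.
std::string render(const std::string& text) {
    std::istringstream in(text);
    return render(in);
}
```

For instance, render("a <br> b") gives "a\nb\n", and two consecutive <hr> tags produce two rules with no blank line between them.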