WordCount Program and Testing

GitHub project: https://github.com/792450735/wc

PSP table:

PSP 2.1 table [1]

PSP2.1 stage	Estimated time (min)	Actual time (min)
Planning	15	20
· Estimate (how long the task will take)	870	1090
Development	800	1040
· Analysis (requirements analysis, including learning new technologies)	90	90
· Design Spec (write the design document)	30	30
· Design Review (review the design document with colleagues)	10	10
· Coding Standard (set suitable conventions for this development)	10	10
· Design (detailed design)	90	70
· Coding	500	700
· Code Review	50	60
· Test (self-testing, fixing code, committing fixes)	50	60
Reporting	70	50
· Test Report	20	20
· Size Measurement (compute the workload)	20	10
· Postmortem & Process Improvement Plan	30	20
Total	885	1110

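Approach:

After receiving the assignment, I first went over the requirements:

Basic features

wc.exe -c file.c // report the number of characters in file.c

wc.exe -w file.c // report the total number of words in file.c

wc.exe -l file.c // report the total number of lines in file.c

wc.exe -o outputFile.txt // write the results to the specified file outputFile.txt

Extended features

wc.exe -s // recursively process the matching files in a directory

wc.exe -a file.c // report more detailed data (code lines / blank lines / comment lines)

wc.exe -e stopList.txt // stop list: words in this file are not counted in a file's word total

With the requirements roughly clear and Java as the required language, I needed to learn: how Java operates on files [2]; basic Java syntax; Java operations on characters, strings, arrays and so on [3]; and how to package the program as an .exe file with its runtime environment bundled in [4].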

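Design and implementation:

I first split the functionality into three parts: counting characters, words and lines, together with stop-word handling, goes in one part (wc()); the -a feature goes in a second (hard()); and the -o feature in a third (oout()).

The program needs only one main class, with these three parts as its methods. Because the methods have to share data, the counters are declared public static, and a separate initialization method, init(), resets them after each file when several files are read.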
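
The reset step described above might look like the following minimal sketch. The class name Counters and the small main demonstration are illustrative assumptions; only the counter names, their initial values and the idea of an init() method come from this post.

```java
// Minimal sketch: shared static counters plus an init() that resets them.
// The class name Counters is hypothetical; the field names follow the post.
final class Counters {
    public static int words = 0;
    public static int lines = 1;              // a file without '\n' still has one line
    public static int chars = 0;
    public static int t1 = 0, t2 = 0, t3 = 0; // code / blank / comment lines

    // Reset every counter so that the next file starts from a clean state.
    public static void init() {
        words = 0;
        lines = 1;
        chars = 0;
        t1 = 0;
        t2 = 0;
        t3 = 0;
    }

    public static void main(String[] args) {
        words = 8; lines = 7; chars = 21;     // pretend one file was just processed
        init();
        System.out.println(words + " " + lines + " " + chars); // prints "0 1 0"
    }
}
```

Calling init() between files keeps one file's totals from leaking into the next, which is the main hazard of keeping the counters in static fields.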

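Code walkthrough:

The key code consists of three core methods, one initialization method and the main function.

Data declarations in the main class:

public static int words=0; // word count
public static int lines=1; // line count
public static int chars=0,t1=0,t2=0,t3=0; // t1, t2, t3: code lines, blank lines, comment lines
public static ArrayList<String> fname=new ArrayList<String>(); // the one or more .c files to read
public static String outname="result.txt"; // name of the output file
public static String stopList; // stop-list file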

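In main, a flags array records which options the user passed, and the array then determines which functions are called. Because the results follow the fixed order of the array rather than the order of the arguments, the options may be given in any order.

int[] flags=new int[]{0,0,0,0,0}; // -c, -w, -l, -e, -a
for(int i=0;i<args.length;i++){
    if(args[i].equals("-c"))
        flags[0]=1;
    else if(args[i].equals("-w"))
        flags[1]=1;
    else if(args[i].equals("-l"))
        flags[2]=1;
    else if(args[i].equals("-a"))
        flags[4]=1;
    else if(args[i].equals("-o")&&i+1<args.length)
        outname=args[++i]; // the next argument is the output file name
    else if(args[i].equals("-e")&&i+1<args.length){
        stopList=args[++i]; // the next argument is the stop-list file
        flags[3]=1;
    }
    else if(!args[i].equals("-o")&&!args[i].equals("-e")&&!args[i].equals("-s"))
        fname.add(args[i]); // anything that is not an option is an input file
}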
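
The option handling above can also be written as a small pure function that is easy to test in isolation. The following is a sketch under assumptions, not the program's actual code: the class ArgSketch and its fields are hypothetical, the flag order (-c, -w, -l, -e, -a) mirrors the flags array, and -s is only recognized, not implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses the same switches and checks that -o and -e
// are followed by a value instead of reading past the end of args.
final class ArgSketch {
    final boolean[] flags = new boolean[5];   // -c, -w, -l, -e, -a
    final List<String> files = new ArrayList<>();
    String outName = "result.txt";            // default output file, as in the post
    String stopList = null;

    static ArgSketch parse(String[] args) {
        ArgSketch r = new ArgSketch();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-c": r.flags[0] = true; break;
                case "-w": r.flags[1] = true; break;
                case "-l": r.flags[2] = true; break;
                case "-a": r.flags[4] = true; break;
                case "-s": break;             // recursion is not handled in this sketch
                case "-o":
                    if (i + 1 >= args.length)
                        throw new IllegalArgumentException("-o needs a file name");
                    r.outName = args[++i];
                    break;
                case "-e":
                    if (i + 1 >= args.length)
                        throw new IllegalArgumentException("-e needs a file name");
                    r.stopList = args[++i];
                    r.flags[3] = true;
                    break;
                default:
                    r.files.add(args[i]);     // anything else is an input file
            }
        }
        return r;
    }

    public static void main(String[] args) {
        ArgSketch r = parse(new String[] {"-a", "-l", "test1.c", "-o", "output.txt"});
        System.out.println(r.files + " -> " + r.outName);
    }
}
```

Rejecting a trailing -o or -e with an exception is a design choice of this sketch; silently ignoring it would be an equally defensible behavior.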

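The method public static void wc(String ff,int[] use) counts characters, words and lines, and applies the stop list.

The core counting code is shown below. chars is incremented for every character read, and lines for every \n. Counting words needs a flag: when a whitespace character is read and the character before it was not whitespace, a word has just ended and words is incremented.

while((c=f.read())!=-1){
    chars++;
    if(c=='\n'){
        lines++;
        chars=chars-2; // drop the '\n' just counted and one more character, i.e. a "\r\n" line ending
    }
    if(whiteSpace.indexOf(c)==-1){
        newlist.append((char)c); // collect the characters of the current word
    }
    if(whiteSpace.indexOf(c)!=-1){
        if(lastNotWhite){ // a word has just ended
            words++;
            for(int j=0;j<stopwords.size();j++){
                if(newlist.toString().equals(stopwords.get(j)))
                    words--; // stop words are not counted
            }
            newlist.delete(0, newlist.length());
        }
        lastNotWhite=false;
        lastisword=true;
    }
    else{
        lastisword=false;
        lastNotWhite=true;
    }
}
// after the loop: check a final word that is not followed by whitespace
if(!lastisword){
    for(int j=0;j<stopwords.size();j++){
        if(newlist.toString().equals(stopwords.get(j)))
            words--;
    }
}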
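
The counting rules can be condensed into one self-contained function. This is a sketch under stated assumptions rather than the program's code: the class CountSketch is hypothetical, it works on an in-memory string instead of a file, the separator set is a guess because the whiteSpace string is not shown in this post, and line terminators are skipped one by one instead of subtracting 2 per '\n'.

```java
// Minimal sketch of character, word and line counting on an in-memory string.
final class CountSketch {
    // Assumed separator set (space, tab, CR, LF, comma); not taken from the post.
    static final String SEPARATORS = " \t\r\n,";

    // Returns {chars, words, lines}. '\r' and '\n' are not counted as characters.
    static int[] count(String text) {
        int chars = 0, words = 0, lines = 1;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') lines++;
            if (c != '\r' && c != '\n') chars++;
            if (SEPARATORS.indexOf(c) != -1) {
                if (inWord) words++;   // a separator closes the current word
                inWord = false;
            } else {
                inWord = true;
            }
        }
        if (inWord) words++;           // the last word may have no trailing separator
        return new int[] {chars, words, lines};
    }

    public static void main(String[] args) {
        int[] r = count("int a,b;\r\nreturn 0;");
        // prints "17 chars, 5 words, 2 lines"
        System.out.println(r[0] + " chars, " + r[1] + " words, " + r[2] + " lines");
    }
}
```

Skipping '\r' and '\n' individually gives the same character total for files with "\n" and "\r\n" line endings, which the subtract-2 approach does not.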

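If the stop-list option is given, the file it names is read and the stop words are stored in a list, as follows:

while((c=fs.read())!=-1){
    if(whiteSpace.indexOf(c)==-1){
        list.append((char)c); // collect the characters of the current stop word
    }
    if(whiteSpace.indexOf(c)!=-1){
        if(lastNotWhite){ // a stop word has just ended
            stopwords.add(list.toString());
            list.delete(0, list.length());
        }
        lastisword=true;
        lastNotWhite=false;
    }
    else{
        lastisword=false;
        lastNotWhite=true;
    }
}
// after the loop: store a final stop word that is not followed by whitespace
if(!lastisword){
    stopwords.add(list.toString());
}
// reset the flags before the source file is counted
lastisword=false;
lastNotWhite=false;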

This way, whenever a word is completed, the stop-word list is scanned, and if the word matches an entry the increment is undone, so words does not grow:

if(whiteSpace.indexOf(c)!=-1){
    if(lastNotWhite){
        words++;
        for(int j=0;j<stopwords.size();j++){
            if(newlist.toString().equals(stopwords.get(j)))
                words--;
        }
        newlist.delete(0, newlist.length());
    }
    lastNotWhite=false;
    lastisword=true;
}
else{
    lastisword=false;
    lastNotWhite=true;
}
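
A set-based variant of the stop-word handling avoids scanning the whole list for every word. The sketch below is an illustration under assumptions, not the program's code: the class StopListSketch is hypothetical, it splits on whitespace with a regular expression instead of reading character by character, and it works on strings rather than files.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch: load stop words into a set, then count the words not in it.
final class StopListSketch {
    // Split the stop-list text on runs of whitespace; empty tokens are dropped.
    static Set<String> parse(String text) {
        Set<String> stop = new HashSet<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) stop.add(token);
        }
        return stop;
    }

    // Count the whitespace-separated words of text that are not stop words.
    static int countWords(String text, Set<String> stop) {
        int words = 0;
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stop.contains(token)) words++;
        }
        return words;
    }

    public static void main(String[] args) {
        Set<String> stop = parse("int return");
        System.out.println(countWords("int a return b", stop)); // prints 2
    }
}
```

With a HashSet each lookup takes roughly constant time, whereas the loop over stopwords in the excerpt costs one comparison per stop word for every word read.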

The method public static void oout(String f,String ff,int[] use) simply creates the specified file and writes the results to it; there is nothing special about it.
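
For readers who want a concrete picture, writing the results in a fixed order might be sketched as follows. Everything here is an assumption for illustration: the class OutputSketch, its method signatures (which differ from oout) and the English label strings are hypothetical, and the real program prints Chinese labels.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: build the result lines in a fixed order, then write them.
final class OutputSketch {
    // use[0..2] select characters, words and lines; the order never changes.
    static List<String> report(String file, boolean[] use, int chars, int words, int lines) {
        List<String> out = new ArrayList<>();
        if (use[0]) out.add(file + ", characters: " + chars);
        if (use[1]) out.add(file + ", words: " + words);
        if (use[2]) out.add(file + ", lines: " + lines);
        return out;
    }

    // Creates (or overwrites) outName and writes one result per line.
    static void write(String outName, List<String> results) {
        try {
            Files.write(Paths.get(outName), results, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Only prints the report; call write("result.txt", ...) to create the file.
        for (String line : report("test1.c", new boolean[] {true, true, true}, 21, 8, 7)) {
            System.out.println(line);
        }
    }
}
```

Separating the formatting from the file write keeps the ordering logic testable without touching the disk.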

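The method public static void hard(String f) handles the -a option. The requirements are:

Code line: the line contains more than one character of code.

Blank line: the line consists entirely of spaces or formatting control characters; if it contains code, it has at most one displayable character, for example "{".

Comment line: the line is not a code line and contains a comment. An interesting case is that some programmers put a comment after a single character:

}//comment

In this case the line counts as a comment line.

Given these requirements, I decided to read the file line by line and keep a flag. When the first visible character is read, the flag is set to 1; if a second visible character follows, the line is a code line. If a "/" is read and the next character is also "/", the line is a comment line. If the flag is still 0 when the scan ends, the line is a blank line.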

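The code is as follows:

public static void hard(String f) throws IOException{ // signature as given above; the throws clause is assumed
    InputStreamReader fff=new InputStreamReader(new FileInputStream(f));
    BufferedReader freader=new BufferedReader(fff);
    String l=null;
    int isflag=0; // isflag=10: code line, =11: comment line, =0: blank line
    while((l=freader.readLine())!=null){
        isflag=0;
        for(int i=0;i<l.length();i++){
            // first character that is not a space, a tab or '/'
            if(l.charAt(i)!=' '&&l.charAt(i)!='/'&&isflag==0&&l.charAt(i)!='\t')
            {isflag=1;continue;}
            // a second such character: code line
            if(l.charAt(i)!=' '&&l.charAt(i)!='\t'&&l.charAt(i)!='/'&&isflag==1)
            {isflag=10;break;}
            // first '/' of a possible "//"
            if(l.charAt(i)=='/'&&isflag==0)
            {isflag=2;continue;}
            if(l.charAt(i)=='/'&&isflag==1)
            {isflag=2;continue;}
            // second '/': comment line
            if(l.charAt(i)=='/'&&isflag==2)
            {isflag=11;break;}
        }
        if(isflag==10||isflag==1) // note: a line with a single visible character is also counted as code here
            t1++;
        else if(isflag==11)
            t3++;
        else
            t2++;
    }
    freader.close();
    // prints "<file>,code lines/blank lines/comment lines:" followed by the three counts
    System.out.println(f+",代码行/空行/注释行："+t1+"/"+t2+"/"+t3);
}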
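
The three line categories can also be derived directly from the requirement text. The following sketch is an alternative written for illustration, not the flag-based code above: the class LineKindSketch is hypothetical, only "//" comments are recognized, block comments are ignored, and a "//" inside a string literal is treated as a comment.

```java
// Minimal sketch: classify one line as CODE, BLANK or COMMENT.
final class LineKindSketch {
    enum Kind { CODE, BLANK, COMMENT }

    static Kind classify(String line) {
        int slash = line.indexOf("//");
        // The code part is everything before the first "//", or the whole line.
        String code = slash >= 0 ? line.substring(0, slash) : line;
        int visible = 0;
        for (int i = 0; i < code.length(); i++) {
            if (!Character.isWhitespace(code.charAt(i))) visible++;
        }
        if (visible > 1) return Kind.CODE;               // more than one character of code
        return slash >= 0 ? Kind.COMMENT : Kind.BLANK;   // at most one visible character
    }

    public static void main(String[] args) {
        System.out.println(classify("}//comment"));  // prints COMMENT
        System.out.println(classify("  {"));         // prints BLANK
        System.out.println(classify("int a = 0;"));  // prints CODE
    }
}
```

Under this rule a line such as "{" is a blank line and "}//comment" is a comment line, matching the definitions quoted in the requirements.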

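Test design:

The tests should cover as many cases as possible. The high-risk areas of the WordCount program are the rarely exercised boundaries where the rules are ambiguous, for example: whether \n, \r and \t count as characters; whether stop words are actually excluded; reading an empty file; whether a character followed by a comment is a code line or a comment line; and whether a comment followed by code is a comment line or a code line.

My test cases are as follows. The program prints Chinese labels: 字符数 = character count, 单词数 = word count, 行数 = line count, 代码行/空行/注释行 = code lines/blank lines/comment lines.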

1: Test the -c option

wc.exe -c test1.c

Expected output:

test1.c,字符数：21

2: Test the -l option

wc.exe -l test1.c

Expected output:

test1.c,行数：7

3: Test the -w option

wc.exe -w test1.c

Expected output:

test1.c,单词数：8

4: Test the -a option

wc.exe -a test1.c

Expected output:

test1.c,代码行/空行/注释行：4/1/2

5: Test that the results are printed in the required order:

wc.exe -a -l -w -c test1.c

Expected output:

test1.c,字符数：21

test1.c,单词数：8

test1.c,行数：7

test1.c,代码行/空行/注释行：4/1/2

6: Test the -o option:

wc.exe -c test1.c -o output.txt

Expected output:

output.txt is created successfully and contains

test1.c,字符数：21

7: Test that the results are written to the specified file in the required order:

wc.exe -a -l -w -c test1.c -o output.txt

Expected output:

output.txt contains:

test1.c,字符数：21

test1.c,单词数：8

test1.c,行数：7

test1.c,代码行/空行/注释行：4/1/2

8: Test the -e option:

wc.exe -w test1.c -e stop.txt

Expected output:

test1.c,单词数：6

9: Test how a comment after a single character or after several characters is classified:

wc.exe -a test2.c

Expected output:

test2.c,代码行/空行/注释行：3/1/0

10: Test the output for an empty file:

wc.exe -a -l -w -c test3.c

Expected output:

test3.c,字符数：0

test3.c,单词数：0

test3.c,行数：1

test3.c,代码行/空行/注释行：0/1/0

References:

[1]: http://www.cnblogs.com/xinz/archive/2011/10/22/2220872.html

[2]: http://www.cnblogs.com/fnlingnzb-learner/p/6010165.html

[3]: http://blog.csdn.net/Maxiao1204/article/details/52880308

[4]: http://blog.csdn.net/sunkun2013/article/details/13167099

posted @ 2018-03-19 16:51 Nathon
