Assignment 3

1. Links

Git repository: https://github.com/JPL1988/WordCount.git

Pair partner's post: https://www.cnblogs.com/1175050954dj/p/10651865.html

2. Pairing process

https://www.cnblogs.com/1175050954dj/p/10651865.html

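3. PSP table

PSP2.1	Personal Software Process Stages	Estimated time (min)	Actual time (min)
Planning	Planning	10	15
· Estimate	· Estimate how long the task will take
Development	Development	120	150
· Analysis	· Requirements analysis (including learning new technologies)	60	45
· Design Spec	· Write the design document	30	20
· Design Review	· Design review (review the design document with a colleague)	30	15
· Coding Standard	· Coding standard (agree on conventions for the current development)	30	10
· Design	· Detailed design	30	30
· Coding	· Coding	20	20
· Code Review	· Code review	20	20
· Test	· Testing (self-testing, fixing code, committing changes)	60	80
Reporting	Reporting	30	25
· Test Report	· Test report	30	20
· Size Measurement	· Measure the workload	20	15
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	20	20
	Total	510	430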

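4. Approach

1) Start the program from the command line and identify the input file.

2) Read the whole file into a single string; the length of that string is the file's character count.

3) Traverse the string and count the total number of words along the way.

4) Split the string into individual strings and store each string with its number of occurrences in a dictionary.

5) Each pass over the dictionary finds the single most frequent string; repeat the pass until 10 strings have been found.

6) Sort in dictionary (lexicographic) order by comparing characters.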
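Steps 2 to 4 can be sketched roughly as follows. This is only an illustration, not the code from the repository: the class name `WordCountSketch` and its method names are hypothetical, tokens are split on whitespace only, and the 4-character minimum mirrors the `count >= 4` check used later in this post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class WordCountSketch
{
    // Step 2: read the whole file into one string.
    public static string LoadText(string path)
    {
        return File.ReadAllText(path);
    }

    // Step 2: the character count is simply the length of that string.
    public static int CountCharacters(string text)
    {
        return text.Length;
    }

    // Step 4: split the text into tokens and tally each token in a dictionary.
    // Assumption: tokens are separated by whitespace and need at least 4 characters.
    public static Dictionary<string, int> BuildFrequency(string text)
    {
        var frequency = new Dictionary<string, int>();
        // A null separator array makes Split use whitespace as the delimiter.
        string[] tokens = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            if (token.Length < 4)
                continue;
            int seen;
            frequency.TryGetValue(token, out seen); // seen stays 0 for a new token
            frequency[token] = seen + 1;
        }
        return frequency;
    }
}
```

For a file path taken from the command line, `WordCountSketch.BuildFrequency(WordCountSketch.LoadText(args[0]))` would produce the dictionary that step 5 then scans.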

5. Design and implementation

Class diagram

Breakpoint debugging to check that the word count is correct

6. Coding standard link

https://blog.csdn.net/qq_31606375/article/details/77783328

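7. Performance improvement

I split the string by traversing it character by character. The whole text is already traversed once to count the words, and it is then traversed a second time to split it into strings when finding the 10 most frequent words. A possible improvement is to split the text with a regular expression first and count the words from the resulting tokens, which saves one traversal.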
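A minimal sketch of that idea, assuming the text is split on runs of whitespace and that a word is any token of at least 4 characters. The class and method names are hypothetical, and digits are not treated as boundaries here, unlike in `ComputeWords`.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SinglePassCounter
{
    // Splits the text once, then fills the frequency dictionary and
    // returns the total number of words from the same loop over the tokens.
    public static int Analyze(string text, Dictionary<string, int> frequency)
    {
        int total = 0;
        // Regex.Split may return empty strings at the ends; the length check drops them.
        foreach (string token in Regex.Split(text, @"\s+"))
        {
            if (token.Length < 4)
                continue;
            total++;
            int seen;
            frequency.TryGetValue(token, out seen); // seen stays 0 for a new token
            frequency[token] = seen + 1;
        }
        return total;
    }
}
```

The total word count and the per-word frequencies both come out of one loop, so the text itself is only scanned by the regular expression.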

8. Code walkthrough

Counting the number of words

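public int ComputeWords(string FileTxt)
{
    int count = 0;   // length of the token currently being scanned
    int result = 0;  // number of words counted so far
    int i = 0;
    // Walk through the whole file text
    while (i < FileTxt.Length)
    {
        // Whitespace (spaces, tabs, line breaks) and digits end the current token;
        // any other character extends it by one
        if (char.IsWhiteSpace(FileTxt[i]) || (FileTxt[i] >= '0' && FileTxt[i] <= '9'))
        {
            // Skip over a run of digits, counting them toward the current token,
            // without running past the end of the text
            while (i < FileTxt.Length && FileTxt[i] >= '0' && FileTxt[i] <= '9')
            {
                i++;
                count++;
            }
            // A token of at least 4 characters counts as a word
            if (count >= 4)
            {
                result++;
            }
            // If the scan stopped on a letter, it starts a new token of length 1;
            // on whitespace or at the end of the text the length goes back to 0
            if (i < FileTxt.Length && !char.IsWhiteSpace(FileTxt[i]))
                count = 1;
            else
                count = 0;
        }
        else
        {
            count++;
        }
        i++;
    }
    // Count the last token when the text does not end with a separator
    if (count >= 4)
    {
        result++;
    }
    return result;
}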

Finding the 10 most frequent words in the dictionary

public string[] CountTimes()
{
    // Holds up to 10 words, most frequent first
    string[] strings = new string[10];
    for (int i = 0; i < 10 && dictionary.Count > 0; i++)
    {
        // Find the most frequent word still in the dictionary;
        // Index[i] records its number of occurrences
        string best = null;
        Index[i] = 0;
        foreach (var pair in dictionary)
        {
            if (pair.Value > Index[i])
            {
                best = pair.Key;
                Index[i] = pair.Value;
            }
        }
        strings[i] = best;
        // Remove the word so the next pass finds the next most frequent one
        dictionary.Remove(best);
    }
    return strings;
}
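`CountTimes` removes every selected word from `dictionary`, so the counts are gone afterwards. As an alternative, a sketch using LINQ (not the approach taken in this assignment) can rank the entries without modifying the dictionary and apply the lexicographic ordering from step 6 to words with equal counts. The class name `TopWordsSketch` and the parameter names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class TopWordsSketch
{
    // Returns up to n words, most frequent first; ties are ordered
    // by ordinal (character code) comparison of the words.
    public static string[] TopWords(Dictionary<string, int> frequency, int n)
    {
        return frequency
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(n)
            .Select(pair => pair.Key)
            .ToArray();
    }
}
```

Sorting the whole dictionary costs more than ten linear scans when the dictionary is very large, so this trades some speed for shorter code and an untouched dictionary.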

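9. Takeaways

Through pair programming I learned how to discuss and analyze problems, exchange suggestions with my partner, and improve the code. I also learned to use git to track code versions and became more familiar with it.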

Posted on 2019-04-03 19:00 by l123456l
