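2022 - BUAA Agile Software Engineering - Pair Programming Project - Longest English Word Chain

Item	Content
Course this assignment belongs to	2022 BUAA Agile Software Engineering
Where the assignment requirements are	Pair Programming Project: Longest English Word Chain
My goal in this course	To understand and experience software engineering, and to progress from writing "programs" to building "software".
How this assignment helps me reach that goal	Experiencing pair programming and getting first practice in engineering-style development.

GitHub project address

Class: Friday section
Project address: https://github.com/GrapeLemonade/two-thirds-of-icpc.git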

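PSP table: estimate

PSP2.1	Personal Software Process Stages	Estimated time (minutes)
Planning	Planning
· Estimate	· Estimate how long the task will take	10
Development	Development
· Analysis	· Requirements analysis (including learning new technologies)	300
· Design Spec	· Write the design document	20
· Design Review	· Design review (review the design document with a colleague)	20
· Coding Standard	· Coding standard (define suitable conventions for the current development)	20
· Design	· Detailed design	200
· Coding	· Coding	1000
· Code Review	· Code review	120
· Test	· Testing (self-testing, fixing code, committing changes)	1000
Reporting	Reporting
· Test Report	· Test report	40
· Size Measurement	· Measure the workload	20
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	20
	Total	2770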

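Ideas behind the interface design

Information Hiding

In computer science, information hiding means isolating the parts of a program that are most likely to change from the rest, so that the rest of the program does not need heavy modification when those parts change. Typically the programmer provides a stable interface, and the frequently changing content sits behind it.

In our design, the computation module exposes the following six interfaces to interact with our own CLI and GUI and with the other team's GUI, which hides the module's internal details: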

int gen_chain_word(const char* words[], int len, char* result[], char head, char tail, bool enable_loop);
int gen_chains_all(const char* words[], int len, char* result[]);
int gen_chain_word_unique(const char* words[], int len, char* result[]);
int gen_chain_char(const char* words[], int len, char* result[], char head, char tail, bool enable_loop);
int engine(const char* words[], int len, char* result[], char head, char tail, bool count, bool weighted, bool enable_self_loop, bool enable_ring);
const char* gui_engine(const char* input, int type, char head, char tail, bool weighted);

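Interface Design

The best known guidelines are the six design principles of object-oriented programming. Although we did not use C++ in an object-oriented way, we did borrow from them. Take the single responsibility principle: each of the six exposed interfaces above has a single responsibility, and none of them mixes too many concerns.

Loose Coupling

Loosely coupled modules depend very little on each other, so modifying them costs far less than modifying tightly coupled ones.

Our computation module is loosely coupled with both the GUI and the CLI, so it can easily be swapped with the corresponding modules of other teams.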

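Design and implementation of the computation module interface

First, all words are deduplicated, because the assignment requires that each word be used at most once.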
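The deduplication step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the helper name dedup_words and the choice of std::unordered_set are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the project): keeps the first occurrence of
// every word and preserves the input order.
std::vector<std::string> dedup_words(const char* words[], int len) {
    assert(len >= 0);
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique_words;
    for (int i = 0; i < len; i++) {
        // insert() reports through .second whether the word was new.
        if (seen.insert(std::string(words[i])).second) unique_words.emplace_back(words[i]);
    }
    return unique_words;
}
```

Calling dedup_words on {"ab", "bc", "ab"} would leave two words, in their original order.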

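Next, we build a directed graph with \(26\) nodes, one per letter. For each word with first letter \(h\) and last letter \(t\), we add a directed edge \(h \to t\). Every trail (a path that does not traverse the same edge twice) of length at least \(2\) in this graph corresponds to a word chain.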
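A sketch of this word-to-edge conversion is shown below. The Edge record and the function build_graph are hypothetical names introduced for illustration, and the code assumes every word is non-empty and already lowercase.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical edge record: one word becomes one edge between letter nodes.
struct Edge {
    int from;    // index of the first letter, 0..25
    int to;      // index of the last letter, 0..25
    int weight;  // word length, used when chains are ranked by letter count
    int word;    // index of the word in the input array
};

// Assumes every word is non-empty and consists of lowercase letters only.
std::vector<std::vector<Edge>> build_graph(const char* words[], int len) {
    assert(len >= 0);
    std::vector<std::vector<Edge>> adj(26);
    for (int i = 0; i < len; i++) {
        int n = static_cast<int>(std::strlen(words[i]));
        int h = words[i][0] - 'a';
        int t = words[i][n - 1] - 'a';
        adj[h].push_back(Edge{h, t, n, i});
    }
    return adj;
}
```

Storing the word index on each edge makes it possible to turn a trail back into the list of words it represents.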

We define an exDAG as a directed graph in which every node has at most one self-loop and which becomes a DAG once all self-loops are removed.
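The definition can be checked directly, as in the sketch below. Note that this uses Kahn's algorithm purely for brevity; it is not the Tarjan-based check used in the project, and the function name is_exdag is an assumption.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical check, using Kahn's algorithm rather than a Tarjan-based one.
// Edges are (from, to) pairs over the letter indices 0..25.
bool is_exdag(const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> self_loops(26, 0), indeg(26, 0);
    std::vector<std::vector<int>> adj(26);
    for (const auto& e : edges) {
        assert(0 <= e.first && e.first < 26 && 0 <= e.second && e.second < 26);
        if (e.first == e.second) {
            if (++self_loops[e.first] > 1) return false;  // two self-loops on one node
        } else {
            adj[e.first].push_back(e.second);
            indeg[e.second]++;
        }
    }
    std::queue<int> ready;
    for (int v = 0; v < 26; v++) if (indeg[v] == 0) ready.push(v);
    int removed = 0;
    while (!ready.empty()) {
        int v = ready.front();
        ready.pop();
        removed++;
        for (int to : adj[v]) if (--indeg[to] == 0) ready.push(to);
    }
    return removed == 26;  // every node was peeled off, so no cycle remains
}
```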

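The seven options fall roughly into the following four types:

-n: can only be used alone; enumerates all trails; the graph must be an exDAG.
-m: can only be used alone; computes the longest trail ignoring all self-loops, with every edge weighted \(1\); the graph must be an exDAG.
-w: computes the longest trail with every edge weighted \(1\); can be combined with the following three options:
- -h a: the trail must start at letter a.
- -t b: the trail must end at letter b.
- -r: without this option the graph must be an exDAG.
-c: computes the longest trail with every edge weighted by its word length; can also be combined with the three options above.

We implemented a function int engine(const char* words[], int len, char* result[], char head, char tail, int type, bool weighted) to handle these four types: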

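words is the array of pointers to the input words.
len is the number of input words.
result is the array of pointers that receives the result; the function's return value is the length of that array.
head is the restriction on the first letter, or 0 if there is none.
tail is the restriction on the last letter, or 0 if there is none.
type indicates which of the four types above is requested.
weighted indicates whether each edge is weighted by its word length.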

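The function first calls void init_words(const char* words[], int len), which converts each word into its corresponding edge.

It then calls void get_SCC(), which uses Tarjan's algorithm to find every strongly connected component of the graph.

Next, if the graph is required to be an exDAG, it calls void check_loop(). This function first checks whether any node has more than one self-loop, and then whether any strongly connected component contains more than one node. If either holds, it throws an exception, which also reports one word ring that exists.

Finally, a different function is called depending on the requested type: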

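If all word chains are requested, int get_all(char* result[]) is called. It enumerates the starting node and finds all word chains by DFS; if the number of chains exceeds \(20000\), it throws an exception.
If the graph is an exDAG, int get_max_DAG(char* result[], char head, char tail, bool enable_self_loop, bool weighted) is called. In essence this is the classic longest-path DP on a DAG. Let \(f_i\) be the length of the longest path starting at node \(i\). Nodes are processed in reverse topological order (already obtained from Tarjan's algorithm), and for each edge \(e\) from \(i\) to \(j\) with weight \(w\) the transition, taking the maximum over all such edges, is:

\[f_i \gets f_{j} + w \]
The actual implementation differs in a number of details:
- If the end node \(i\) is fixed, \(f_j\) is initialized to \(-\infty\) for all \(j \ne i\) and \(f_i\) to \(0\); otherwise every value is initialized to \(0\).
- The chosen path has to be recorded along with the maximum.
- If enable_self_loop is true, the weight of the self-loop at \(i\) is added to \(f_i\) once \(f_i\) has been computed, and the path output is adjusted accordingly.
- Because the final chain must contain at least \(2\) words, the answer is collected by enumerating the first edge (\(i \to j\)), checking it against the start restriction if there is one, and taking its weight plus \(f_j\); a first edge that is a self-loop needs special handling.
If the graph is not required to be an exDAG, int get_max(char* result[], char head, char tail, bool weighted) is called. In essence this is a bitmask DP for the longest trail in a general directed graph. Finding the longest trail in a general directed graph is NP-hard; to guarantee correctness we did not use an approximation algorithm. Concretely, let \(f_{S,i}\) be the longest length reachable from the state in which the set of used edges is \(S\) and the current node is \(i\). For an edge \(e \notin S\) from \(i\) to \(j\) with weight \(w\), the transition is:

\[f_{S,i} \gets f_{S \cup \{e\},j}+w \]
In the implementation, since there are at most \(100\) edges, \(S\) could be stored in an __int128. However, I could not find a way to use __int128 in Visual Studio, and since only bitwise operations are needed, a pair<long long, long long> is used instead. Each \(f_{S,i}\) is then stored in a map<pair<pair<long long, long long>, int>, int> for memoized search.

This function also has many implementation details similar to those of get_max_DAG, so they are not repeated here.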
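The memoized search over states \((S, i)\) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the names TrailEdge, Mask, longest_from and longest_trail are hypothetical, unsigned 64-bit halves are used for the bit set, and the head, tail and minimum-length rules are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct TrailEdge { int from, to, weight; };  // hypothetical edge record

using Mask = std::pair<std::uint64_t, std::uint64_t>;  // 128 bits, enough for 100 edges

static bool test_bit(const Mask& s, int k) {
    return ((k < 64 ? s.first >> k : s.second >> (k - 64)) & 1) != 0;
}
static Mask set_bit(Mask s, int k) {
    if (k < 64) s.first |= std::uint64_t{1} << k;
    else s.second |= std::uint64_t{1} << (k - 64);
    return s;
}

// f(S, i): best total weight that can still be collected from node i when the
// edges in S are already used. Memoized on the pair (S, i).
int longest_from(const std::vector<TrailEdge>& edges, Mask used, int node,
                 std::map<std::pair<Mask, int>, int>& memo) {
    auto key = std::make_pair(used, node);
    auto it = memo.find(key);
    if (it != memo.end()) return it->second;
    int best = 0;
    for (int k = 0; k < static_cast<int>(edges.size()); k++) {
        if (edges[k].from != node || test_bit(used, k)) continue;
        best = std::max(best, edges[k].weight +
                        longest_from(edges, set_bit(used, k), edges[k].to, memo));
    }
    memo[key] = best;
    return best;
}

// Longest trail over all 26 start nodes.
int longest_trail(const std::vector<TrailEdge>& edges) {
    assert(edges.size() <= 128);
    std::map<std::pair<Mask, int>, int> memo;
    int best = 0;
    for (int v = 0; v < 26; v++) best = std::max(best, longest_from(edges, Mask{0, 0}, v, memo));
    return best;
}
```

std::pair already provides the lexicographic ordering that std::map needs, which is why the pair-based bit set can be used as a key without a custom comparator.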

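The engine function itself is not exposed; to make interaction with the CLI and the GUI easier, it is wrapped further.

To interact with the other team's GUI, we wrapped four functions, all of which call engine internally: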

int gen_chain_word(const char* words[], int len, char* result[], char head, char tail, bool enable_loop)
int gen_chains_all(const char* words[], int len, char* result[])
int gen_chain_word_unique(const char* words[], int len, char* result[])
int gen_chain_char(const char* words[], int len, char* result[], char head, char tail, bool enable_loop)

To interact with our CLI and to make it easy for the CLI to pass options through, engine is overloaded as int engine(const char* words[], int len, char* result[], char head, char tail, bool count, bool weighted, bool enable_self_loop, bool enable_ring), which also calls the original engine internally.

To interact with our GUI, engine is wrapped as const char* gui_engine(const char* input, int type, char head, char tail, bool weighted), which likewise calls the original engine internally.

All six functions are packaged into core.dll.

UML diagram

UML

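Performance improvements to the computation module interface

The DAG part already runs in linear time, so its complexity cannot be improved further.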
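To make the linear bound concrete, here is a minimal sketch of a longest-path DP on a DAG. It is an illustration only: it obtains the topological order with Kahn's algorithm instead of reusing Tarjan's output, it ignores self-loops and the head and tail options, and the names DagEdge and longest_path_dag are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

struct DagEdge { int from, to, weight; };  // hypothetical edge record

// Returns the largest total weight of a path; assumes the graph on n nodes is
// acyclic (no self-loops) and ignores the head/tail options.
int longest_path_dag(int n, const std::vector<DagEdge>& edges) {
    std::vector<std::vector<int>> out(n);
    std::vector<int> indeg(n, 0), order;
    for (int k = 0; k < static_cast<int>(edges.size()); k++) {
        out[edges[k].from].push_back(k);
        indeg[edges[k].to]++;
    }
    std::queue<int> ready;
    for (int v = 0; v < n; v++) if (indeg[v] == 0) ready.push(v);
    while (!ready.empty()) {
        int v = ready.front();
        ready.pop();
        order.push_back(v);
        for (int k : out[v]) if (--indeg[edges[k].to] == 0) ready.push(edges[k].to);
    }
    assert(static_cast<int>(order.size()) == n);  // fails if a cycle exists

    std::vector<int> f(n, 0);  // f[i]: best weight of a path starting at i
    int best = 0;
    for (int idx = n - 1; idx >= 0; idx--) {  // reverse topological order
        int i = order[idx];
        for (int k : out[i]) f[i] = std::max(f[i], f[edges[k].to] + edges[k].weight);
        best = std::max(best, f[i]);
    }
    return best;
}
```

Every node and every edge is touched a constant number of times, which is where the linear running time comes from.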

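The non-DAG part is NP-hard, so while keeping the result exact we can only optimize for some special kinds of data:

If the current node has self-loops, traverse all of them first.
If there are parallel edges, only branch on the one with the largest weight.
Keep in \(S\) only the edges that belong to the same strongly connected component as \(i\). Concretely, when \(i\) and \(j\) are not in the same strongly connected component, the transition becomes:

\[f_{S,i} \gets f_{\varnothing,j}+w_{i,j} \]

These optimizations do not change the worst-case time complexity: a complete directed graph on \(10\) nodes, for example, is enough to push the algorithm into exponential running time.

On inputs that are not adversarially constructed the algorithm performs reasonably well. The bottleneck lies inside each strongly connected component, so if the number of valid paths within a single component is small, the answer is found quickly.

No extra time was spent on changes here: coding started late, so everything had been thought through before coding began.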

Taking the complete directed graph on \(5\) nodes as an example, the profiling results are as follows:

性能1.png

性能2.png

性能3.png

Most of the time is spent in dfs_max and in the map lookups of the memoized search, which is inevitable given our design.

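Ideas related to contracts

Design by Contract

Design by Contract is an approach to designing software. It requires the designer to define formal, precise, and verifiable interfaces for software components, which extends the traditional notion of abstract data types with preconditions, postconditions, and invariants.

We encountered the similar JML in the OO course, where the lecturer repeatedly pointed out that formal methods see limited use in engineering practice. This pair programming project has no interfaces complicated enough to need it: the meaning of almost every interface is clear from its name and parameters, so Design by Contract is hard to apply in earnest. Forcing it would be more ceremony than substance and would waste time.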
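For reference, a lightweight version of preconditions and postconditions can be expressed with plain assertions. The sketch below is purely illustrative and not part of the project; the helper total_letters is a hypothetical function chosen because its contract is easy to state.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: total letter count of a chain of len words.
int total_letters(const char* const chain[], int len) {
    // Preconditions: a valid array and a non-negative length.
    assert(chain != nullptr && len >= 0);
    int sum = 0;
    for (int i = 0; i < len; i++) {
        assert(chain[i] != nullptr && chain[i][0] != '\0');  // precondition per word
        sum += static_cast<int>(std::strlen(chain[i]));
    }
    // Postcondition: every word contributes at least one letter.
    assert(sum >= len);
    return sum;
}
```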

Code Contract

Code Contract is a Design by Contract extension that Microsoft provides for .NET. This project did not use .NET, so we did not use it.

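Unit tests of the computation module

The unit tests of the computation module consist of two parts: hand-written case tests and differential tests against brute-force reference solutions.

Hand-written case tests

These mainly cover the examples from the assignment and some small cases that are easy to get wrong. The tests of the gen_chain_word interface are used as the example below.

First, the function test calls the interface and compares the result with the expected answer: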

void test(const char* words[], int len, const char* ans[], int ans_len, char head, char tail, bool enable_loop){
    char** result = (char**)malloc(10000);
    int out_len = gen_chain_word(words, len, result, head, tail, enable_loop);
    Assert::AreEqual(ans_len, out_len);
    for(int i = 0;i < ans_len;i++){
        if(result != nullptr) Assert::AreEqual(strcmp(ans[i], result[i]), 0);
    }
}

The corresponding test cases are as follows:

/*
* -w example
*/ 
TEST_METHOD(example_w){
    const char* words[] = {"algebra", "apple", "zoo", "elephant", "under", "fox", "dog", "moon", "leaf", "trick", "pseudopseudohypoparathyroidism"};
    const char* ans[] = {"algebra", "apple", "elephant", "trick"}; 
    test(words, 11, ans, 4, 0, 0, false);
}

/*
* -h example
*/ 
TEST_METHOD(example_h){
    const char* words[] = {"algebra", "apple", "zoo", "elephant", "under", "fox", "dog", "moon", "leaf", "trick", "pseudopseudohypoparathyroidism"};
    const char* ans[] = {"elephant", "trick"};
    test(words, 11, ans, 2, 'e', 0, false);
}

/*
* -t example
*/ 
TEST_METHOD(example_t){
    const char* words[] = {"algebra", "apple", "zoo", "elephant", "under", "fox", "dog", "moon", "leaf", "trick", "pseudopseudohypoparathyroidism"};
    const char* ans[] = {"algebra", "apple", "elephant"};
    test(words, 11, ans, 3, 0, 't', false);
}

/*
* -h and -t used together
*/ 
TEST_METHOD(together_h_t){
    const char* words[] = {"algebra", "apple", "zoo", "elephant", "under", "fox", "dog", "moon", "leaf", "trick", "pseudopseudohypoparathyroidism"};
    const char* ans[] = {"algebra", "apple", "elephant"};
    test(words, 11, ans, 3, 'a', 't', false);
}

/*
* A chain of length 5
*/ 
TEST_METHOD(simple_chain){
    const char* words[] = {"ab", "bc", "cd", "de", "ef"};
    const char* ans[] = {"ab", "bc", "cd", "de", "ef"};
    test(words, 5, ans, 5, 0, 0, false);
}

/*
* A DAG with 4 nodes and 6 edges
*/ 
TEST_METHOD(max_DAG_4_vertices){
    const char* words[] = {"ab", "ac", "ad", "bc", "bd", "cd"};
    const char* ans[] = {"ab", "bc", "cd"};
    test(words, 6, ans, 3, 0, 0, false);
}

/*
* A chain over 3 nodes plus 3 self-loops
*/ 
TEST_METHOD(simple_chain_with_self_loop){
    const char* words[] = {"aa", "ab", "bb", "bc", "cc"};
    const char* ans[] = {"aa", "ab", "bb", "bc", "cc"};
    test(words, 5, ans, 5, 0, 0, false);
}

/*
* A chain over 3 nodes plus 3 self-loops vs. a plain chain of 4 words
*/ 
TEST_METHOD(simple_chain_with_self_loop_vs_simple_chain){
    const char* words[] = {"aa", "ab", "bb", "bc", "cc", "de", "ef", "fg", "gh"};
    const char* ans[] = {"aa", "ab", "bb", "bc", "cc"};
    test(words, 9, ans, 5, 0, 0, false);
}

/*
* 5 isolated nodes, each with a self-loop
*/ 
TEST_METHOD(isolated_vertex_with_self_loop){
    const char* words[] = {"aa", "bb", "cc", "dd", "ee"};
    const char* ans[] = {"none"};
    test(words, 5, ans, 0, 0, 0, false);
}

/*
* A chain of length 5 plus some single-letter words
*/ 
TEST_METHOD(simple_chain_and_single_character){
    const char* words[] = {"ab", "bc", "cd", "de", "ea", "a", "b", "c", "d", "e"};
    const char* ans[] = {"ab", "bc", "cd", "de", "ea"};
    test(words, 10, ans, 5, 0, 0, true);
}

/*
* -r example
*/ 
TEST_METHOD(example_r){
    const char* words[] = {"element", "heaven", "table", "teach", "talk"};
    const char* ans[] = {"table", "element", "teach", "heaven"};
    test(words, 5, ans, 4, 0, 0, true);
}

/*
* 10 copies of aa
*/ 
TEST_METHOD(only_self_loop_aa){
    const char* words[] = {"aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa"};
    const char* ans[] = {"xx"};
    test(words, 10, ans, 0, 0, 0, true);
}

Differential tests

First, the data generator (Generator):

unsigned int rnd(){
	seed ^= seed << 13;
	seed ^= seed >> 7;
	seed ^= seed << 17;
	return seed;
}

const char** generator(int n, bool DAG, int len, unsigned int Seed){
	seed = Seed ^ n ^ len;
	char** words = (char**)malloc(len * sizeof(char*));
	for(int i = 0;i < len;i++){
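		int word_len = rnd() % 10 + 3; // 3..12 letters; renamed so it no longer shadows the parameter len
		words[i] = (char*)malloc((word_len + 1ll) * sizeof(char));
		if (words[i] != nullptr){
			words[i][0] = rnd() % n + 'a';
			words[i][1] = (char)(i + 'a'); // the second letter encodes the index, keeping the words distinct
			for(int j = 2;j < word_len;j++) words[i][j] = (char)(rnd() % n + 'a');
			if(DAG && words[i][0] >= words[i][word_len - 1]){
				// force first letter < last letter so that every edge points forward
				if(words[i][0] == words[i][word_len - 1]){
					if(words[i][0] == n - 1 + 'a') words[i][0]--;
					else words[i][word_len - 1]++;
				}else std::swap(words[i][0], words[i][word_len - 1]);
			}
			words[i][word_len] = 0;
		}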
		}
	}
	return (const char**)words;
}

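This generates a random word list: DAG indicates whether the result should form a DAG, len is the number of words to generate, and Seed is the random seed passed in. It uses a random number generator from EC Final 2020 (an xorshift-style generator), which guarantees that the generated data depends only on the arguments passed in, so a failing unit test is easy to reproduce.

Next come two brute-force reference programs. The first enumerates word orderings in \(O(n \cdot n!)\):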

int brute_force(const char* words[], int len, char head, char tail, bool weighted){
	int a[10] = {0}, b[10] = {0};
	for(int i = 0;i < len;i++){
		a[i] = i;
		b[i] = (int)strlen(words[i]);
	}
	int ans = 0;
	do{
		if(head && words[a[0]][0] != head) continue;
		int sum = weighted ? b[a[0]] : 1;
		for(int i = 1;i < len;i++){
			if(words[a[i]][0] != words[a[i - 1]][b[a[i - 1]] - 1]) break;
			sum += weighted ? b[a[i]] : 1;
			if(!tail || words[a[i]][b[a[i]] - 1] == tail) ans = std::max(ans, sum);
		}
	}while(std::next_permutation(a, a + len));
	return ans;
}

The second is a naive bitmask DP in \(O\left(n^22^n\right)\):

int f[1 << 20][20];
int dp(const char* words[], int len, char head, char tail, bool weighted){
	int b[20];
	for(int i = 0;i < len;i++) b[i] = (int)strlen(words[i]);
	for(int i = 0;i < (1 << len);i++) for(int j = 0;j < len;j++) f[i][j] = (int)-1e9;
	for(int i = 0;i < len;i++){
		if(!head || words[i][0] == head) f[1ll << i][i] = weighted ? b[i] : 1;
	}
	for(int i = 0;i < (1 << len);i++) for(int j = 0;j < len;j++) if(i & (1 << j)){
		for(int k = 0;k < len;k++) if(!(i & (1 << k))){
			if(words[j][b[j] - 1] == words[k][0]) f[i | (1ll << k)][k] = std::max(f[i | (1ll << k)][k], f[i][j] + (weighted ? b[k] : 1));
		}
	}
	int ans = 0;
	for(int i = 0;i < (1 << len);i++) for(int j = 0;j < len;j++) if(i & (1 << j)){
		if(i == (1 << j)) continue;
		if(!tail || words[j][b[j] - 1] == tail) ans = std::max(ans, f[i][j]);
	}
	return ans;
}

Last is the output checker (Checker), which verifies that every word in the answer appears in the original word list:

void checker(const char* words[], int len, char* result[], int out_len){
	for(int i = 0;i < out_len;i++){
		bool tag = false;
		for(int j = 0;j < len;j++){
			if(strcmp(words[j], result[i]) == 0){
				tag = true;
				break;
			}
		}
		Assert::AreEqual(tag, true);
	}
}
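The Checker above only tests membership. A stricter check could also verify that consecutive words connect and that no word is repeated. The following is a hypothetical sketch, not part of the project's test suite; the name is_valid_chain is an assumption, and an empty result (no chain found) would have to be handled by the caller before using it.

```cpp
#include <cassert>
#include <cstring>
#include <set>
#include <string>

// Hypothetical stricter check: adjacency, no reuse, and at least two words.
bool is_valid_chain(const char* const result[], int out_len) {
    assert(out_len >= 0);
    if (out_len < 2) return false;
    std::set<std::string> used;
    for (int i = 0; i < out_len; i++) {
        if (result[i] == nullptr || result[i][0] == '\0') return false;
        if (!used.insert(std::string(result[i])).second) return false;  // word used twice
        if (i > 0) {
            const char* prev = result[i - 1];
            // The last letter of the previous word must equal the first letter here.
            if (prev[std::strlen(prev) - 1] != result[i][0]) return false;
        }
    }
    return true;
}
```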

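Again using the gen_chain_word interface as the example, the differential test function works as follows. To keep the unit tests fast, both reference programs are run when the input has at most \(8\) words; otherwise only the \(O\left(n^22^n\right)\) one is run. The maximum number of words computed by gen_chain_word is compared against the one or two reference results, and finally Checker is called to verify that the word sequence returned by gen_chain_word is legal.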

void stress(int n, bool DAG, int len, unsigned int seed, char head, char tail){
    const char** words = generator(n, DAG, len, seed);
    char** result = (char**)malloc(10000);
    int out_len = gen_chain_word(words, len, result, head, tail, !DAG);
    int ans_len = dp(words, len, head, tail, false);
    if(len <= 8) Assert::AreEqual(ans_len, brute_force(words, len, head, tail, false));
    Assert::AreEqual(ans_len, out_len);
    checker(words, len, result, out_len);
}

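The function supports setting the range n of first and last letters, the first-letter restriction head, and the last-letter restriction tail.

After iterating over the four possible cases and the various parameters, and balancing running time against test strength, we arrived at the following test methods: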

/*
* Differential test: no restriction on head or tail
*/ 
TEST_METHOD(stresses_0_0){
    for(int n = 2;n <= 26;n += 2){
        for(int len = 1;len <= 19;len += 3){
            for(int i = 0;i < 5;i++){
                stress(n, false, len, i, 0, 0);
                stress(n, true, len, i, 0, 0);
            }
        }
    }
}

/*
* Differential test: head restricted, tail unrestricted
*/ 
TEST_METHOD(stresses_1_0){
    for(int n = 2;n <= 26;n += 2){
        for(int len = 1;len <= 19;len += 3){
            for(int i = 0;i < 5;i++){
                char head = rnd() % n + 'a';
                stress(n, false, len, i, head, 0);
                stress(n, true, len, i, head, 0);
            }
        }
    }
}

/*
* Differential test: head unrestricted, tail restricted
*/ 
TEST_METHOD(stresses_0_1){
    for(int n = 2;n <= 26;n += 2){
        for(int len = 1;len <= 19;len += 3){
            for(int i = 0;i < 5;i++){
                char tail = rnd() % n + 'a';
                stress(n, false, len, i, 0, tail);
                stress(n, true, len, i, 0, tail);
            }
        }
    }
}

/*
* Differential test: both head and tail restricted
*/ 
TEST_METHOD(stresses_1_1){
    for(int n = 2;n <= 26;n += 2){
        for(int len = 1;len <= 19;len += 3){
            for(int i = 0;i < 5;i++){
                char head = rnd() % n + 'a', tail = rnd() % n + 'a';
                stress(n, false, len, i, head, tail);
                stress(n, true, len, i, head, tail);
            }
        }
    }
}

Code coverage

覆盖率.png

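Exception handling

We defined \(19\) kinds of exceptions in total.

Command-line argument exceptions

Too few arguments

Returns the error message missing arguments. The unit test is as follows: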

TEST_METHOD(missing_arguments){
    try{
        const char* args[] = {"WordList.exe"};
        main_serve(1, args);
    }catch(invalid_argument const &e){
        Assert::AreEqual(0, strcmp("missing arguments", e.what()));
        return;
    }
    Assert::Fail();
}

Duplicated option

Returns the error message option duplicated -- 'x', where x is the duplicated option. The unit test is as follows:

TEST_METHOD(option_duplicated){
    try{
        const char* args[] = {"WordList.exe", "-h", "a", "-h", "a"};
        main_serve(5, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("option duplicated -- 'h'", e.what()));
        return;
    }
    Assert::Fail();
}

`-h,-t` is missing its argument

Returns the error message option requires an argument -- 'x', where x is the option that is missing its argument. The unit test is as follows:

TEST_METHOD(missing_argument_head_tail){
    try{
        const char* args[] = {"WordList.exe", "-h", "a", "-t"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("option requires an argument -- 't'", e.what()));
        return;
    }
    Assert::Fail();
}

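Argument of `-h,-t` is longer than 1

Returns the error message argument of option -x should be a single alphabet, yyy received, where x is the option whose argument is longer than 1 and yyy is that argument. The unit test is as follows: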

TEST_METHOD(argument_too_long_head_tail){
    try{
        const char* args[] = {"WordList.exe", "-h", "123"};
        main_serve(3, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("argument of option -h should be a single alphabet, 123 received", e.what()));
        return;
    }
    Assert::Fail();
}

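Argument of `-h,-t` is not a single letter

Returns the error message argument of option -x should be an alphabet, yyy received, where x is the option whose argument is not a single letter and yyy is that argument. The unit test is as follows: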

TEST_METHOD(argument_not_alphabet_head_tail){
    try{
        const char* args[] = {"WordList.exe", "-h", "1"};
        main_serve(3, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("argument of option -h should be an alphabet, 1 received", e.what()));
        return;
    }
    Assert::Fail();
}

Invalid option

Returns the error message invalid option -- 'x', where x is the invalid option. The unit test is as follows:

TEST_METHOD(invalid_option){
    try{
        const char* args[] = {"WordList.exe", "-q", "123"};
        main_serve(3, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("invalid option -- 'q'", e.what()));
        return;
    }
    Assert::Fail();
}

Multiple file arguments

Returns the error message multiple text files given: "x" and "y", where x and y are the two file arguments. The unit test is as follows:

TEST_METHOD(too_many_file_arguments){
    try{
        const char* args[] = {"WordList.exe", "-n", "1.in", "2.in"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("multiple text files given: \"1.in\" and \"2.in\"", e.what()));
        return;
    }
    Assert::Fail();
}

`-n` is not used alone

Returns the error message cannot combine -n with other options. The unit test is as follows:

TEST_METHOD(option_n_with_other){
    try{
        const char* args[] = {"WordList.exe", "-n", "1.in", "-r"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("cannot combine -n with other options", e.what()));
        return;
    }
    Assert::Fail();
}

`-m` combined with `-h,-t,-r`

Returns the error message cannot combine -m with -h, -t or -r. The unit test is as follows:

TEST_METHOD(option_m_with_h_t_r){
    int cnt = 0;
    try{
        const char* args[] = {"WordList.exe", "-m", "1.in", "-h", "a"};
        main_serve(5, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("cannot combine -m with -h, -t or -r", e.what()));
        cnt++;
    }
    try{
        const char* args[] = {"WordList.exe", "-m", "1.in", "-t", "a"};
        main_serve(5, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("cannot combine -m with -h, -t or -r", e.what()));
        cnt++;
    }
    try{
        const char* args[] = {"WordList.exe", "-m", "1.in", "-r"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("cannot combine -m with -h, -t or -r", e.what()));
        cnt++;
    }
    Assert::AreEqual(3, cnt);
}

`-m,-w,-c` combined with each other

Returns the error message conflicting option combinations. The unit test is as follows:

TEST_METHOD(option_m_w_c_together){
    int cnt = 0;
    try{
        const char* args[] = {"WordList.exe", "-m", "-w", "1.in"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("conflicting option combinations", e.what()));
        cnt++;
    }
    try{
        const char* args[] = {"WordList.exe", "-m", "-c", "1.in"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("conflicting option combinations", e.what()));
        cnt++;
    }
    try{
        const char* args[] = {"WordList.exe", "-w", "-c", "1.in"};
        main_serve(4, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("conflicting option combinations", e.what()));
        cnt++;
    }
    Assert::AreEqual(3, cnt);
}

None of `-n,-m,-w,-c` is given, but other options are

Returns the error message missing -n, -w, -m or -c. The unit test is as follows:

TEST_METHOD(no_option_n_m_w_c){
    try{
        const char* args[] = { "WordList.exe", "-h", "a", "-t", "b", "-r"};
        main_serve(6, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("missing -n, -w, -m or -c", e.what()));
        return;
    }
    Assert::Fail();
}

No option at all

Returns the error message no option specified. The unit test is as follows:

TEST_METHOD(no_option_at_all){
    try{
        const char* args[] = { "WordList.exe", "1.in"};
        main_serve(2, args);
    }catch(invalid_argument const& e){
        Assert::AreEqual(0, strcmp("no option specified", e.what()));
        return;
    }
    Assert::Fail();
}

File-related exceptions

File does not exist

Returns the error message x: No such file, where x is the file name. The unit test is as follows:

TEST_METHOD(no_such_file){
    try{
        const char* args[] = {"WordList.exe", "-n", "no_such_file.in"};
        main_serve(3, args);
    }catch(runtime_error const& e){
        Assert::AreEqual(0, strcmp("no_such_file.in: No such file", e.what()));
        return;
    }
    Assert::Fail();
}

The input path is not a regular file

Returns the error message x: Not a regular file, where x is the file name. The unit test is as follows:

TEST_METHOD(not_regular_file){
    try{
        create_directory("not_regular_file.in");
        const char* args[] = {"WordList.exe", "-n", "not_regular_file.in"};
        main_serve(3, args);
    }catch(runtime_error const& e){
        Assert::AreEqual(0, strcmp("not_regular_file.in: Not a regular file", e.what()));
        remove_all("not_regular_file.in");
        return;
    }
    Assert::Fail();
}

The input file cannot be read

Returns the error message x: Cannot open as read-only, where x is the file name. The unit test is as follows:

TEST_METHOD(cant_read){
    try{
        const char* args[] = {"WordList.exe", "-n", "..\\..\\test\\data\\cant_read.txt"};
        main_serve(3, args);
    }catch(runtime_error const& e){
        Assert::AreEqual(0, strcmp("..\\..\\test\\data\\cant_read.txt: Cannot open as read-only", e.what()));
        return;
    }
    Assert::Fail();
}

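Note that on Windows the unit test for this exception is somewhat special. Our approach is to create a file test/data/cant_read.txt that contains letters, then right-click the file, choose Properties, then Security, click Edit, and set the Read permission of Authenticated Users to Deny.

Because Git cannot read this file, it cannot be pushed to GitHub. Teaching assistants who want to verify this unit test should create the file as described above.

The input file contains no words

Returns the error message x: File does not contain words, where x is the file name. The unit test is as follows: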

TEST_METHOD(no_words){
    try{
        ofstream output;
        output.open("no_words.in", ios::out | ios::binary | ios::trunc);
        output.close();
        const char* args[] = {"WordList.exe", "-n", "no_words.in"};
        main_serve(3, args);
    }catch(runtime_error const& e){
        Assert::AreEqual(0, strcmp("no_words.in: File does not contain words", e.what()));
        return;
    }
    Assert::Fail();
}

Cannot write `solution.txt`

Returns the error message solution.txt: Cannot open for writing. The unit test is as follows:

TEST_METHOD(cant_write){
    try{
        ofstream output;
        output.open("cant_write.in", ios::out | ios::binary | ios::trunc);
        output << "abc" << endl;
        output.close();
        create_directory("solution.txt");
        const char* args[] = {"WordList.exe", "-w", "cant_write.in"};
        main_serve(3, args);
    }catch(runtime_error const& e){
        Assert::AreEqual(0, strcmp("solution.txt: Cannot open for writing", e.what()));
        remove_all("solution.txt");
        return;
    }
    Assert::Fail();
}

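Runtime exceptions

Note that, to make integration with the other team easier, the four interfaces intended for them catch exceptions internally and return \(-1\), so the unit tests below call the engine function directly.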
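The catch-and-return pattern used by those four interfaces can be sketched as follows. Both functions are hypothetical stand-ins written for this illustration: demo_engine plays the role of the throwing engine and demo_gen_chain the role of an exported wrapper.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for the real engine (hypothetical): throws on invalid input.
int demo_engine(const char* words[], int len) {
    if (words == nullptr || len <= 0) throw std::logic_error("no words");
    return len;
}

// Wrapper in the style described above: never lets an exception escape.
int demo_gen_chain(const char* words[], int len) noexcept {
    try {
        return demo_engine(words, len);
    } catch (const std::exception&) {
        return -1;  // error signalled through the return value
    } catch (...) {
        return -1;  // also covers non-standard exception types
    }
}
```

Keeping exceptions inside the library boundary matters here because the caller may be built with a different compiler or runtime, where propagating a C++ exception across the dll boundary is not reliable.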

A word ring exists

Returns the error message Word ring detected: xxx, where xxx is a word ring. The unit tests are as follows:

/*
* More than one self-loop: two copies of aa
*/ 
TEST_METHOD(more_than_one_self_loop){
    try{
        const char* words[] = {"aa", "aa"};
        char** result = (char**)malloc(10000);
        engine(words, 2, result, 0, 0, true, false, true, false);
    }catch(std::logic_error const& e){
        Logger::WriteMessage(e.what());
        Assert::AreEqual(0, strcmp("Word ring detected: aa aa", e.what()));
        return;
    }
    Assert::Fail();
}

/*
* Not a DAG: a ring over two nodes
*/ 
TEST_METHOD(not_dag_two_vectices){
    try{
        const char* words[] = {"ab", "ba"};
        char** result = (char**)malloc(10000);
        engine(words, 2, result, 0, 0, true, false, true, false);
    }catch(std::logic_error const& e){
        Logger::WriteMessage(e.what());
        Assert::AreEqual(0, strcmp("Word ring detected: ba ab", e.what()));
        return;
    }
    Assert::Fail();
}

/*
* Not a DAG: a ring over three nodes
*/ 
TEST_METHOD(not_dag_three_vectices){
    try{
        const char* words[] = {"ab", "bc", "ca"};
        char** result = (char**)malloc(10000);
        engine(words, 3, result, 0, 0, true, false, true, false);
    }catch(std::logic_error const& e){
        Logger::WriteMessage(e.what());
        Assert::AreEqual(0, strcmp("Word ring detected: bc ca ab", e.what()));
        return;
    }
    Assert::Fail();
}

`-n` yields more than 20000 word chains

Returns the error message Too many word chains!. The unit test is as follows:

TEST_METHOD(too_many_word_chains){
    try{
        const char* words[] = {"ab", "abb", "abbb", "bc", "bcc", "bccc", "cd", "cdd", "cddd", "de", "dee", "deee", "ef", "eff", "efff", "fg", "fgg", "fggg", "gh", "ghh", "ghhh", "hi", "hii", "hiii", "ij", "ijj", "ijjj", "jk", "jkk", "jkkk"};
        char** result = (char**)malloc(10000);
        engine(words, 30, result, 0, 0, true, false, true, false);
    }catch(std::logic_error const& e){
        Logger::WriteMessage(e.what());
        Assert::AreEqual(0, strcmp("Too many word chains!", e.what()));
        return;
    }
    Assert::Fail();
}

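GUI module design

Technologies used

The technology stack we used to build the GUI application is as follows:

The packaging plugin is vue-cli-plugin-electron-builder, whose packaging engine is electron-builder.

In addition, we used the following other packages:

Design style

Vuetify, which we use, provides a Material Design style interface. On top of it, we built a clean, panel-based interface. It looks like this:

gui1

The top-left panel is the control panel, the bottom-left panel is the input panel, and the right panel is the output panel.

Interface construction

We used the Vue framework and the components provided by Vuetify to build the application interface quickly. The examples below illustrate how.

The template example is our mode selection view, which uses Vuetify's v-list-item-group to present the four available modes: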

<v-list dark class="rounded-l pa-0 primary" style="height: 100%">
    <v-list-item-group
        mandatory
        active-class="indicator"
        v-model="selectedMode"
        style="height: 100%"
    >
        <v-list-item
            v-for="(item, i) in modes"
            :key="i"
            style="height: 25%"
        >
        <v-list-item-content>
            <v-list-item-title class="text-center">
            {{ item }}
            </v-list-item-title>
        </v-list-item-content>
        </v-list-item>
    </v-list-item-group>
</v-list>

The result:
gui2

The script example consists of three small functions, used to show a pop-up message and to clear the two text boxes.

export default {
  name: 'App',
  data: () => ({
    inputText: '',
    outputText: '',
    snackbar: false,
    timeout: 3000,
    reportText: '',
    // ...
  }),
  methods: {
    reportError(msg) {
      this.reportText = msg
      this.snackbar = true
    },
    clearInputText() {
      this.inputText = ''
    },
    clearOutputText() {
      this.outputText = ''
    },
    // ...
  }
  // ...
}

The style example is the mode-options class, which controls the layout of the available options:

.mode-options {
  display: grid !important;
  grid-template-rows: repeat(3, minmax(0, 1fr));
  margin: 0 !important;
}

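The result is already shown in the screenshot above.

Connecting the front end and the back end

We use the node-ffi-napi library to connect the Node.js environment with the dll.

Dynamic library declaration

It is implemented as follows: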

const ffi = window.require('ffi-napi')
const path = window.require('path')	// needed for path.resolve below
const corePtr = ffi.DynamicLibrary(path.resolve('./core.dll')).get('gui_engine')
const core = ffi.ForeignFunction(corePtr, 'string', 
				['string', 'int', 'char', 'char', 'bool'])

Asynchronous call

In Vue, we bound the following handler to the onclick event of the Solve (求解) button:

solve() {
      this.calculating = true				// temporarily disable the button
      this.outputText = ''
      this.runMessage = ''
      let start = moment()
      core.async(
          this.inputText, 					// arguments
          [0, this.allowRing ? 3 : 1, 2, this.allowRing ? 3 : 1][this.selectedMode],
          this.noAvailableOptions || !this.head ? 0 : this.head.charCodeAt(0),
          this.noAvailableOptions || !this.tail ? 0 : this.tail.charCodeAt(0),
          this.selectedMode === 3,
          (e, d) => {
            if (e) {
              this.reportError(e)		// error raised by the FFI call itself
            } else if (/^WordList-GUI: /.test(d)) {
              this.reportError(d.substring(14))	// error reported by core.dll, prefix stripped
            } else {
              this.outputText = d
              this.runMessage = '计算用时：' + moment().diff(start) + 'ms'	// "Elapsed time: ... ms"
            }
            this.calculating = false		// re-enable the button
          }
      )
    }
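The handler above treats any output starting with WordList-GUI: as an error message. On the C++ side, a wrapper that produces such output might look like the sketch below. This is an assumption about the general shape only: demo_solve and demo_gui_engine are hypothetical names, and the real gui_engine takes more parameters.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the real solver (hypothetical): rejects empty input.
std::string demo_solve(const std::string& input) {
    if (input.empty()) throw std::runtime_error("File does not contain words");
    return input;
}

// Returns a C string that stays valid until the next call. Not thread-safe;
// this relies on the GUI disabling the button while a call is running.
extern "C" const char* demo_gui_engine(const char* input) {
    static std::string buffer;
    try {
        buffer = demo_solve(input ? input : "");
    } catch (const std::exception& e) {
        // Errors travel through the same string channel, marked by a prefix.
        buffer = std::string("WordList-GUI: ") + e.what();
    }
    return buffer.c_str();
}
```

Returning a pointer into a static buffer avoids handing ownership of heap memory across the FFI boundary, at the cost of allowing only one outstanding result at a time.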

Demo

https://live.csdn.net/v/embed/195613

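Loose coupling in practice: swapping and integrating modules

We swapped modules with another team, aalex1945 and lyyf (to protect everyone's privacy as far as possible, their student IDs and names are not listed here).

The result of the integration is in the dev-combine branch of the repository.

The main problems encountered during integration and their solutions were:

Problem: inconsistent interfaces
- We did not use the standard interface agreed on by the course staff; in the end they adapted to our interface.
Problem: different memory allocation approaches
- Solved with a simple interface adapter.
Problem: mismatched target word size
- Both sides build for the x86_64 architecture.
Problem: different language standard versions
- Both sides use C++20 (-std=c++2a).
Problem: conflicting cv-qualifiers in the interfaces
- Convert with const_cast<> where it is known to be safe.
Problem: different error handling approaches
- We changed throw to returning a negative value.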
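As an illustration of the cv-qualifier fix, an adapter between a const-correct signature and one that takes non-const pointers could look like this. The two functions are hypothetical stand-ins; the actual direction of the cast in the integrated code is not stated in this post.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the other side's entry point (hypothetical): takes non-const
// pointers but only reads the words.
int their_count_letters(char* words[], int len) {
    int sum = 0;
    for (int i = 0; i < len; i++) sum += static_cast<int>(std::strlen(words[i]));
    return sum;
}

// Adapter with a const-correct signature. The cast is safe only because the
// callee never writes through the pointers.
int our_count_letters(const char* words[], int len) {
    assert(words != nullptr && len >= 0);
    return their_count_letters(const_cast<char**>(words), len);
}
```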

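The pairing process

We did our pair programming in the public study area next to Room 213 of Building 3. Both sessions were accompanied by wonderful piano music, including but not limited to Unravel, 彩云追月, Canon, and グランドエスケープ.

Photos of the pairing sessions:

Pair programming summary

Pros and cons of pair programming

Pro: problems can be discussed together, which keeps one person from getting stuck in a wrong direction and giving up.
Con: if the difficulty of the task does not lie in the coding itself, pair programming is less efficient than splitting the work between two people.

Strengths and weaknesses of each partner

pantw
- Strengths: very familiar with engineering-style development, with C++, and with front-end development. (I was carried by pantw the whole way \(\text{orz}\))
- Weaknesses: a somewhat inverted sleep schedule, which may not be very healthy (?)
JJLeo
- Strengths: fairly familiar with the relevant algorithms, strong at implementing code, and experienced with C++ as used in competitive programming.
- Weaknesses: extremely unfamiliar with engineering-style development; often spent a whole evening on failed builds or getting beaten up by Git.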

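PSP table: actual

PSP2.1	Personal Software Process Stages	Estimated time (minutes)	Actual time (minutes)
Planning	Planning
· Estimate	· Estimate how long the task will take	10	15
Development	Development
· Analysis	· Requirements analysis (including learning new technologies)	300	600
· Design Spec	· Write the design document	20	30
· Design Review	· Design review (review the design document with a colleague)	20	30
· Coding Standard	· Coding standard (define suitable conventions for the current development)	20	30
· Design	· Detailed design	200	200
· Coding	· Coding	1000	1200
· Code Review	· Code review	120	150
· Test	· Testing (self-testing, fixing code, committing changes)	1000	2000
Reporting	Reporting
· Test Report	· Test report	40	80
· Size Measurement	· Measure the workload	20	20
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	20	20
	Total	2770	4375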

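A small rant

Could performance scores please stop being part of the grading? Handing out an NP-hard problem, asking everyone to optimize it, and providing no test data: why has this pattern shown up in four of our six semesters?

Why is the assignment handout written so badly? Why does such a simple problem (solving word chains and their variants) leave so many points of confusion? Is it that hard to state a problem clearly?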

posted @ 2022-04-05 23:38 JJLeo