A Microsoft Written-Test Problem
Fresh from last weekend.
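Given a string that contains only common printable ASCII characters plus spaces and newlines, filter it as follows:
1. Strip leading and trailing whitespace;
2. Collapse every run of consecutive whitespace in the middle to a single character;
3. Remove the whitespace immediately before and after each newline.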
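The problem is not hard. In Microsoft's usual style, though, the point of a question like this is not whether the candidate can write a program at all (although failing to do so would not look good), but whether the candidate thinks through every aspect of the problem: skill shows in the details.
In the spirit of "test first", start with the test cases: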
// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = " hello world";
char arr03[] = "hello world ";
char arr04[] = " hello world ";

// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";

// Test cases for spaces around new-lines.
char arr07[] = "hello \n world ";
char arr08[] = " hello world \n ";
char arr09[] = "\n hello world \n ";

// Corner cases
char arr10[] = " ";
char arr11[] = "\n";
char arr12[] = " \n ";
Readers are welcome to add more test cases.
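The cases above list inputs only. A test-first workflow also benefits from writing down the expected output next to each input, so that results are checked rather than read by eye. The sketch below does that under one assumed reading of the rules: only spaces are trimmed and collapsed, and newlines themselves are always kept. The names `filter_case`, `run_cases` and `filter_forward` are hypothetical. `filter_forward` is only a stand-in so the table can run by itself (a single forward pass with separate read and write indices); it is not the solution discussed in this post.

```c
#include <stdio.h>
#include <string.h>

/* An input paired with the output expected under the assumed reading. */
struct filter_case {
    const char *input;
    const char *expected;
};

static const struct filter_case cases[] = {
    { "hello_world",        "hello_world"     },
    { "  hello world",      "hello world"     },
    { "hello world  ",      "hello world"     },
    { "hello    world",     "hello world"     },
    { "hello  \n  world ",  "hello\nworld"    },
    { " hello  world \n ",  "hello world\n"   },
    { "\n hello world \n ", "\nhello world\n" },
    { "   ",                ""                },
    { "\n",                 "\n"              },
    { " \n ",               "\n"              },
    { "",                   ""                },
};

typedef void (*filter_fn)(char *str);

/* Stand-in filter (hypothetical): filters str in place in one forward pass. */
void filter_forward(char *str)
{
    size_t w = 0;            /* next write position, never ahead of r */
    int pending_space = 0;   /* a run of spaces was seen but not written yet */

    for (size_t r = 0; str[r] != '\0'; ++r) {
        char c = str[r];
        if (c == ' ') {
            /* Spaces at the very start or right after a newline are dropped. */
            if (w > 0 && str[w - 1] != '\n')
                pending_space = 1;
        } else if (c == '\n') {
            pending_space = 0;   /* spaces before a newline are dropped */
            str[w++] = c;
        } else {
            if (pending_space) {
                str[w++] = ' ';  /* the whole run collapses to one space */
                pending_space = 0;
            }
            str[w++] = c;
        }
    }
    str[w] = '\0';               /* a pending trailing space is discarded */
}

/* Runs every case through `filter`; returns the number of mismatches. */
int run_cases(filter_fn filter)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; ++i) {
        char buf[64];
        strcpy(buf, cases[i].input);   /* filter a writable copy */
        filter(buf);
        if (strcmp(buf, cases[i].expected) != 0) {
            printf("case %zu: got \"%s\"\n", i, buf);
            ++failures;
        }
    }
    return failures;
}
```

Calling `run_cases(filter_forward)` should report zero mismatches. To check a filter with a different signature, such as one that also takes the length, a small adapter that calls it with `strlen(str)` can be passed instead.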
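With the test cases in hand, consider the implementation. A naive front-to-back pass that deletes each unwanted character in place has to shift the rest of the string again and again. This solution walks from back to front instead: the characters to keep are written toward the end of the buffer, and the result is moved to the front once at the very end.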
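The back-to-front idea is easier to see on a simpler task. The sketch below only removes every space from a string; it is an illustration under that simplification, not the full filter, and `strip_spaces_backward` is a hypothetical name. Kept characters are packed against the end of the buffer, and a single `memmove` brings them to the front.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes every ' ' from str[0..len) in place.
 * Assumes str[len] is the terminating '\0'. */
void strip_spaces_backward(char *str, size_t len)
{
    size_t dst = len;   /* kept characters occupy str[dst..len) */

    /* Walk from the last character to the first. The write position
     * never drops below the read position, so nothing unread is lost. */
    for (size_t i = len; i > 0; --i) {
        char c = str[i - 1];
        if (c != ' ')
            str[--dst] = c;
    }

    /* One move at the end: kept characters plus the '\0' at str[len]. */
    memmove(str, str + dst, len - dst + 1);
}
```

For example, calling it on a buffer holding `"a b  c "` with its `strlen` leaves `"abc"` in the buffer.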
The complete code:
#include <stdio.h>
#include <assert.h>
#include <string.h>

// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = " hello world";
char arr03[] = "hello world ";
char arr04[] = " hello world ";

// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";

// Test cases for spaces around new-lines.
char arr07[] = "hello \n world ";
char arr08[] = " hello world \n ";
char arr09[] = "\n hello world \n ";

// Corner cases
char arr10[] = " ";
char arr11[] = "\n";
char arr12[] = " \n ";

void filter_spaces(char *str, size_t len)
{
    if (len == 0)   // empty string: nothing to filter.
        return;

    char *dst = str + len - 1;    // next write position, moving backwards.
    char *curr = str + len - 1;   // next read position, moving backwards.

    // remove trailing spaces; check the bound before dereferencing.
    while (curr >= str && *curr == ' ')
        --curr;

    if (curr < str) {   // all spaces.
        *str = '\0';
        return;
    }

    int after_space = 0;      // the last character written was a space.
    int around_newline = 0;   // the last character written was a newline.

    while (curr >= str) {
        switch (*curr) {
        case ' ':
            if (after_space) {
                // a space followed by another space, omit it.
                --curr;
            } else if (around_newline) {
                // a space around a newline, omit it.
                --curr;
            } else {
                after_space = 1;
                *dst-- = *curr--;
            }
            break;
        case '\n':
            around_newline = 1;
            if (after_space) {
                // remove last recorded space.
                assert(*(dst + 1) == ' ');
                ++dst;
                after_space = 0;
            }
            *dst-- = *curr--;
            break;
        default:   // other chars
            *dst-- = *curr--;
            after_space = 0;
            around_newline = 0;
            break;
        }
    }

    ++dst;
    if (*dst == ' ')   // remove the leading space; at most one can remain.
        ++dst;

    // now the filtered string size is ( (str + len) - dst + 1 ),
    // including the trailing '\0'
    memmove(str, dst, (str + len) - dst + 1);
}

#define TEST_STR(str) do {\
    filter_spaces(str, strlen(str));\
    printf(#str ": \"%s\"\n", str);\
} while (0)

int main(void)
{
    TEST_STR(arr00);
    TEST_STR(arr01);
    TEST_STR(arr02);
    TEST_STR(arr03);
    TEST_STR(arr04);
    TEST_STR(arr05);
    TEST_STR(arr06);
    TEST_STR(arr07);
    TEST_STR(arr08);
    TEST_STR(arr09);
    TEST_STR(arr10);
    TEST_STR(arr11);
    TEST_STR(arr12);
    return 0;
}
Note the final memmove: the source and destination regions may overlap, so neither memcpy nor strcpy can be used here.
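The overlap is easy to reproduce in isolation. The sketch below shifts the tail of a string to the front of the same buffer, which is the same kind of move the filter performs at the end. `drop_prefix` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drops the first n characters of a NUL-terminated
 * string in place. Assumes n <= strlen(str). */
void drop_prefix(char *str, size_t n)
{
    size_t len = strlen(str);

    /* The source (str + n) and the destination (str) live in the same
     * buffer and can overlap. memmove is specified to handle that;
     * memcpy and strcpy have undefined behavior on overlapping regions. */
    memmove(str, str + n, len - n + 1);   /* +1 copies the '\0' too */
}
```

With a buffer holding `"   hello"` and `n = 3`, the source covers offsets 3 through 8 and the destination offsets 0 through 5, so the two regions share offsets 3 through 5; after the call the buffer holds `"hello"`.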