A Microsoft Written-Test Question

Fresh from last weekend.

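Given a string that contains only common printable ASCII characters plus spaces and newlines, filter it as follows:
1. Remove leading and trailing blanks;
2. Collapse each run of consecutive blanks in the middle to a single one;
3. Remove blanks immediately before and after a newline;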

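The problem is not hard. In keeping with Microsoft's usual style, though, a question like this is not meant to test whether a candidate can write a program (failing to produce one would of course not look good), but whether they can think through every aspect of the problem: "skill shows in the details."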

In the spirit of "test first", we can start from the test cases:

// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = "  hello world";
char arr03[] = "hello world  ";
char arr04[] = "  hello world  ";
// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";
// Test cases for spaces around new-lines.
char arr07[] = "hello  \n   world  ";
char arr08[] = "  hello    world   \n ";
char arr09[] = "\n  hello    world   \n ";
// Corner cases
char arr10[] = "   ";
char arr11[] = "\n";
char arr12[] = "  \n  ";

Readers are welcome to add more test cases.
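When adding cases, it can help to check the output mechanically instead of reading printed strings. The following is a minimal sketch of such a check; the name is_filtered is hypothetical (it is not part of the original post), and it assumes that "blank" means the space character only, so newlines are always kept.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s already satisfies the three rules, 0 otherwise.
 * Assumption: only ' ' counts as a blank; '\n' is never removed. */
int is_filtered(const char *s)
{
    size_t i;

    assert(s != NULL);
    if (s[0] == ' ')
        return 0;                 /* rule 1: leading space */
    for (i = 0; s[i] != '\0'; ++i) {
        if (s[i] != ' ')
            continue;
        if (s[i + 1] == '\0')
            return 0;             /* rule 1: trailing space */
        if (s[i + 1] == ' ')
            return 0;             /* rule 2: consecutive spaces */
        if (s[i + 1] == '\n')
            return 0;             /* rule 3: space before a newline */
        if (i > 0 && s[i - 1] == '\n')
            return 0;             /* rule 3: space after a newline */
    }
    return 1;
}
```

Calling is_filtered on the result of any filtering routine should return 1 for every test case. It only checks the shape of the output, not that the non-blank characters were preserved, so it complements rather than replaces a comparison against expected strings.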

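With these test cases in hand, we can think about the implementation. Scanning from front to back and deleting characters as they are found means repeatedly shifting the rest of the string, so it is simpler to scan from back to front.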
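For comparison, a front-to-back scan can avoid the shifting if it keeps separate read and write positions, because the write position never overtakes the read position. The sketch below is an alternative, not the approach used in this post; filter_spaces_forward and filters_to are hypothetical names, and it again assumes that only the space character counts as a blank.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One forward pass, in place. A run of spaces is remembered as a single
 * "pending" space and written only when it turns out to sit between two
 * non-newline characters. */
void filter_spaces_forward(char *str)
{
    size_t r;               /* read index */
    size_t w = 0;           /* write index, always <= r */
    int pending_space = 0;  /* spaces seen since the last written char */

    assert(str != NULL);
    for (r = 0; str[r] != '\0'; ++r) {
        char c = str[r];
        if (c == ' ') {
            pending_space = 1;
            continue;
        }
        /* Skip the pending space at the start of the string, before a
         * newline, and after a newline. */
        if (pending_space && w > 0 && c != '\n' && str[w - 1] != '\n')
            str[w++] = ' ';
        pending_space = 0;
        str[w++] = c;
    }
    str[w] = '\0';  /* pending trailing spaces are dropped here */
}

/* Small helper for trying the function out: filters a copy of input and
 * reports whether the result equals expected. */
int filters_to(const char *input, const char *expected)
{
    char buf[128];

    assert(strlen(input) < sizeof buf);
    strcpy(buf, input);
    filter_spaces_forward(buf);
    return strcmp(buf, expected) == 0;
}
```

For example, filters_to("hello  \n   world  ", "hello\nworld") returns 1. The back-to-front version trades this bookkeeping for a final memmove.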

The complete code for the back-to-front version:

#include <stdio.h>
#include <assert.h>
#include <string.h>

// Test cases for leading & trailing spaces.
char arr00[] = "hello_world";
char arr01[] = "hello world";
char arr02[] = "  hello world";
char arr03[] = "hello world  ";
char arr04[] = "  hello world  ";
// Test cases for consecutive spaces.
char arr05[] = "hello    world";
char arr06[] = "  hello    world  ";
// Test cases for spaces around new-lines.
char arr07[] = "hello  \n   world  ";
char arr08[] = "  hello    world   \n ";
char arr09[] = "\n  hello    world   \n ";
// Corner cases
char arr10[] = "   ";
char arr11[] = "\n";
char arr12[] = "  \n  ";

void filter_spaces(char *str, size_t len)
{
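    if (len == 0) // empty string: nothing to filter, and str + len - 1 would be out of bounds.
        return;

    char *dst   = str + len - 1;
    char *curr  = str + len - 1;

    // Test the bound first so *curr is never read before the start of str.
    while (curr >= str && *curr == ' ')
        --curr; // remove trailing spaces;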
    if (curr < str) { // all spaces.
        *str = '\0';
        return;
    }
    int after_space = 0;
    int around_newline = 0;
    while (curr >= str) {
        switch (*curr) {
        case ' ':
            if (after_space) { // a space followed by another space, omit it.
                --curr;
            } else if (around_newline) { // a space around a newline, omit it.
                --curr;
            } else {
                after_space = 1;
                *dst-- = *curr--;
            }
            break;
        case '\n':
            around_newline = 1;
            if (after_space) { // remove last recorded space.
                assert(*(dst + 1) == ' ');
                ++dst;
                after_space = 0;
            }
            *dst-- = *curr--;
            break;
        default: // other chars
            *dst-- = *curr--;
            after_space = 0;
            around_newline = 0;
            break;
        }
    }
    ++dst;
    if (*dst == ' ') // remove leading spaces.
        ++dst;
    // now the filtered string size is ( (str + len) - dst + 1 ),
    // including the trailing '\0'
    memmove(str, dst, (str + len) - dst + 1);
}

#define TEST_STR(str) do {\
    filter_spaces(str, strlen(str));\
    printf(#str ": \"%s\"\n", str);\
} while(0)

int main(void)
{
    TEST_STR(arr00);
    TEST_STR(arr01);
    TEST_STR(arr02);
    TEST_STR(arr03);
    TEST_STR(arr04);
    TEST_STR(arr05);
    TEST_STR(arr06);
    TEST_STR(arr07);
    TEST_STR(arr08);
    TEST_STR(arr09);
    TEST_STR(arr10);
    TEST_STR(arr11);
    TEST_STR(arr12);

    return 0;
}

Note the final memmove: the source and destination regions may overlap, so neither memcpy nor strcpy is safe here.
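The same situation can be shown in isolation. The sketch below uses a hypothetical helper, drop_prefix, which shifts a string left in place; source and destination overlap whenever the part being kept is longer than the part being dropped, and the C standard leaves memcpy and strcpy undefined for overlapping objects, while memmove is specified to handle them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Removes the first n characters of buf in place by sliding the rest of
 * the string, including its terminator, to the front. */
void drop_prefix(char *buf, size_t n)
{
    size_t len = strlen(buf);

    assert(n <= len);
    /* buf and buf + n may overlap, so memmove is required. */
    memmove(buf, buf + n, len - n + 1); /* +1 copies the '\0' as well */
}
```

With char s[] = "   hello world", calling drop_prefix(s, 3) leaves s equal to "hello world". This mirrors the last step of filter_spaces, where dst plays the role of buf + n.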

posted @ 2011-05-30 20:24  qsort