Implementing a Simple Code Word Counter (Part 2)

In this post we implement the basic functionality first; extra features and refinements are left for later posts.

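To write a function that counts the words in a piece of code, let's start by designing its interface. Consider the shape of the output first: the input is the code as text, and the output should pair each word with its number of occurrences, which suggests a map: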

std::map<std::string, size_t> getWordCount(std::string const& code);

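With later steps in mind, however, we will certainly want to know which words occur most often. A std::map is ordered by key, so finding the highest counts means walking it with an iterator and comparing it->second, which is inconvenient. A vector is probably the better fit here, because it can later be sorted directly with std::sort: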

std::vector<std::pair<std::string, size_t>> getWordCount(std::string const& code);
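For comparison, here is a minimal sketch of what the map-based lookup described above might look like. The helper name mostFrequentWord is hypothetical and not part of the original code. It finds a single maximum with std::max_element, but a full ranking would still require copying the entries into a sequence container, which is why the vector return type is attractive.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, for illustration only: find the most frequent word
// directly in a map. Returns an empty string when the map is empty.
std::string mostFrequentWord(std::map<std::string, std::size_t> const& counts)
{
    // The map is ordered by key, so the largest count has to be searched for.
    auto const it = std::max_element(
        counts.begin(), counts.end(),
        [](auto const& a, auto const& b) { return a.second < b.second; });
    return it == counts.end() ? std::string{} : it->first;
}
```

The generic lambda requires C++14, the same as the sorting code later in this post.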

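Next comes the delimiter problem. The code may contain spaces, commas, or symbols such as &&, || and ->, so we split the string on these delimiters to strip them out: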

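bool isDelimiter(char c)
{
	// Cast to unsigned char: passing a negative char value to isalnum is undefined behavior.
	auto const isAllowedInName = isalnum(static_cast<unsigned char>(c)) || c == '_';
	return !isAllowedInName;
}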


auto symbols = std::vector<std::string>{};
boost::split(symbols, code, isDelimiter);
// Adjacent delimiters produce empty strings; drop them with the erase-remove idiom.
symbols.erase(std::remove(begin(symbols), end(symbols), ""), end(symbols));
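If Boost is not available, the same splitting rule can be sketched with the standard library alone. This is an alternative, not the approach used in this post, and the function name splitIntoWords is an assumption made for illustration. It skips empty tokens as it goes, so no separate erase step is needed.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Same rule as above: anything that cannot appear in an identifier is a delimiter.
bool isDelimiter(char c)
{
    return !(std::isalnum(static_cast<unsigned char>(c)) || c == '_');
}

// Hypothetical Boost-free alternative to boost::split followed by erase.
std::vector<std::string> splitIntoWords(std::string const& code)
{
    auto words = std::vector<std::string>{};
    auto current = std::string{};
    for (char const c : code)
    {
        if (isDelimiter(c))
        {
            // A delimiter ends the current word, if there is one.
            if (!current.empty())
            {
                words.push_back(current);
                current.clear();
            }
        }
        else
        {
            current += c;
        }
    }
    // Flush the last word when the input does not end with a delimiter.
    if (!current.empty()) words.push_back(current);
    return words;
}
```

For example, splitIntoWords("a && b->c_1, d") yields the four words a, b, c_1 and d.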

Then the counting function:

std::map<std::string, size_t> countWords(std::vector<std::string> const& words)
{
	auto wordCount = std::map<std::string, size_t>{};
	for (auto const& word : words)
	{
		++wordCount[word];
	}
	return wordCount;
}
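The loop above works because std::map::operator[] inserts a zero-initialized counter the first time a key is seen. The sketch below repeats countWords with that point commented and adds a read-only lookup; the helper countOf is hypothetical and only illustrates how to query a count without inserting a new entry.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

std::map<std::string, std::size_t> countWords(std::vector<std::string> const& words)
{
    auto wordCount = std::map<std::string, std::size_t>{};
    for (auto const& word : words)
    {
        // operator[] inserts a value-initialized (zero) counter for a new word,
        // so no explicit existence check is needed before incrementing.
        ++wordCount[word];
    }
    return wordCount;
}

// Hypothetical lookup helper: read a count without inserting a new entry.
std::size_t countOf(std::map<std::string, std::size_t> const& counts, std::string const& word)
{
    auto const it = counts.find(word);
    return it == counts.end() ? 0 : it->second;
}
```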

Next, sort by occurrence count, highest first:

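// WordCount is an alias for std::vector<std::pair<std::string, size_t>> (see the full listing).
auto sortedWordCount = WordCount(begin(wordCount), end(wordCount));  // copy the map entries into a vector of pairs
std::sort(begin(sortedWordCount), end(sortedWordCount), [](auto const& p1, auto const& p2) { return p1.second > p2.second; });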
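One thing to keep in mind: std::sort is not stable, so words with equal counts may come out in any order. If a reproducible listing matters, a comparator with a tie-breaker is one option. The sketch below is a possible variant rather than the code used in this post, and the function name sortByCountThenWord is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using WordCount = std::vector<std::pair<std::string, std::size_t>>;

// Hypothetical variant: count descending, then word ascending to break ties.
void sortByCountThenWord(WordCount& entries)
{
    std::sort(entries.begin(), entries.end(),
              [](auto const& a, auto const& b)
              {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
}
```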

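That completes the basic operations. How do we test them? Let's start with a local test by embedding a small piece of code in the program. The catch is that the sample code may contain spaces and quotation marks and may span several lines. How do we handle that? Fortunately, the raw string literals introduced in C++11 solve exactly this problem: prefix the literal with R and wrap the text in parentheses inside the quotes, for example R"(this is a code test)".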
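A short sketch of how raw string literals behave may help here. The function names and sample texts are made up for illustration. The second function shows the custom-delimiter form, which is needed when the embedded text itself contains a closing parenthesis followed by a double quote.

```cpp
#include <string>

// A raw string literal keeps quotes, backslashes and line breaks verbatim.
// The second line starts at column 0 so that no indentation ends up in the text.
std::string plainSample()
{
    return R"(auto s = "a\tb";
return s;)";
}

// With a custom delimiter (here the arbitrary word code), the literal only
// ends at the matching closing sequence, so the nested text is kept intact.
std::string delimitedSample()
{
    return R"code(print(R"(nested)");)code";
}
```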

The complete code is as follows:

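#include<algorithm>   // std::sort, std::remove, std::max_element
#include<cctype>      // isalnum
#include<cstdlib>     // system
#include<iostream>
#include<iomanip>
#include<string>
#include<map>
#include<vector>
#include<iterator>
#include<boost/algorithm/string.hpp>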

using WordCount = std::vector<std::pair<std::string, size_t>>;
WordCount getWordCount(std::string const& code);

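bool isDelimiter(char c)
{
	// Cast to unsigned char: passing a negative char value to isalnum is undefined behavior.
	auto const isAllowedInName = isalnum(static_cast<unsigned char>(c)) || c == '_';
	return !isAllowedInName;
}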

std::map<std::string, size_t> countWords(std::vector<std::string> const& words)
{
	auto wordCount = std::map<std::string, size_t>{};
	for (auto const& word : words)
	{
		++wordCount[word];
	}
	return wordCount;
}

WordCount getWordCount(std::string const& code)
{
	auto symbols = std::vector<std::string>{};
	boost::split(symbols, code, isDelimiter);
	symbols.erase(std::remove(begin(symbols), end(symbols), ""), end(symbols));

	auto const wordCount = countWords(symbols);

	auto sortedWordCount = WordCount(begin(wordCount), end(wordCount));  // copy the map entries into a vector of pairs
	std::sort(begin(sortedWordCount), end(sortedWordCount), [](auto const& p1, auto const& p2) { return p1.second > p2.second; });

	return sortedWordCount;
}

//void print(WordCount const& entries)
//{
//	for (auto const& entry : entries)
//	{
//		std::cout << std::setw(30) << std::left << entry.first << '|' << std::setw(10) << std::right << entry.second << '\n';
//	}
//}

void print(WordCount const& entries)
{
	if (entries.empty()) return;
	auto const longestWord = *std::max_element(begin(entries), end(entries), [](auto const& p1, auto const& p2) { return p1.first.size() < p2.first.size(); });
	auto const longestWordSize = longestWord.first.size();
	for (auto const& entry : entries)
	{
		std::cout << std::setw(longestWordSize + 1) << std::left << entry.first << '|' << std::setw(10) << std::right << entry.second << '\n';
	}
}

static constexpr auto code = R"(
bool isDelimiter(char c)
{
auto const isAllowedInName = isalnum(c) || c == '_';
return !isAllowedInName;
}
std::map<std::string, size_t> countWords(std::vector<std::string> const& words)
{
auto wordCount = std::map<std::string, size_t>{};
for (auto const& word : words)
{
++wordCount[word];
}
return wordCount;
}
WordCount getWordCount(std::string const& code)
{
auto symbols = std::vector<std::string>{};
boost::split(symbols, code, isDelimiter);
symbols.erase(std::remove(begin(symbols), end(symbols), ""), end(symbols));
auto const wordCount = countWords(symbols);
auto sortedWordCount = WordCount(begin(wordCount), end(wordCount));
std::sort(begin(sortedWordCount), end(sortedWordCount), [](auto const& p1, auto const& p2){ return p1.second > p2.second; });
return sortedWordCount;
}
)";

int main()
{
	print(getWordCount(code));
	system("pause");  // Windows-only: keeps the console window open
}

The result is as follows:

[Screenshot of the program output: 1545027130251.png]

posted @ 2018-12-17 14:13  MrYun