第十一章关联容器

关联容器和顺序容器的不同：关联容器中的元素时按照关键字来保存和访问的。
关联容器支持通过关键字来高效地查找和读取元素，基本的关联容器类型是 map和 set。

关联容器类型：

容器类型	解释
按顺序存储
`map`	关键数组：保存`关键字-值`对
`set`	关键字即值，即只保存关键字的容器
`multimap`	支持同一个键多次出现的`map`
`multiset`	支持同一个键多次出现的`set`
无序集合
`unordered_map`	用哈希函数组织的`map`
`unordered_set`	用哈希函数组织的`set`
`unordered_multimap`	哈希组织的`map`，关键字可以重复出现
`unordered_multiset`	哈希组织的`set`，关键字可以重复出现

练习11.1

描述map和vector的不同。

解：

map 是关联容器， vector 是顺序容器。

练习11.2

分别给出最适合使用list、vector、deque、map以及set的例子。

解：

list：双向链表，适合频繁插入删除元素的场景。
vector：适合频繁访问元素的场景。
deque：双端队列，适合频繁在头尾插入删除元素的场景。
map：字典。
set：适合有序不重复的元素的场景。

练习11.3

编写你自己的单词计数程序。

解：

#include <string>
#include <map>
#include <iostream>

using namespace std;

int main(){
    map<string, int> word_count;
    string tmp;
    while (cin >> tmp){
        word_count[tmp] += 1;
    }
    for (const auto& elem : word_count)
		std::cout << elem.first << " : " << elem.second << endl;
	return 0;
}

练习11.4

扩展你的程序，忽略大小写和标点。例如，"example."、"example,"和"Example"应该递增相同的计数器。

解：

#include <iostream>
#include <map>
#include <string>
#include <algorithm>
#include <cctype>

void word_count_pro(std::map<std::string, int>& m)
{
	std::string word;
	while (std::cin >> word)
	{
		for (auto& ch : word) 
			ch = tolower(ch);
		
		word.erase(std::remove_if(word.begin(), word.end(), ispunct),
			word.end());
		++m[word];
	}
	for (const auto& e : m) std::cout << e.first << " : " << e.second << "\n";
}

int main()
{
	std::map<std::string, int> m;
	word_count_pro(m);

	return 0;
}

关联容器概述

定义关联容器

需要指定元素类型。
列表初始化：
- map：map<string, int> word_count = {{"a", 1}, {"b", 2}};
- set：set<string> exclude = {"the", "a"};

练习11.5

解释map和set的区别。你如何选择使用哪个？

解：

map 是键值对，而 set 只有键没有值。当我需要存储键值对的时候使用 map，而只需要键的时候使用 set。

练习11.6

解释set和list的区别。你如何选择使用哪个？

set 是有序不重复集合，底层实现是红黑树，而 list 是无序可重复集合，底层实现是链表。

练习11.7

定义一个map，关键字是家庭的姓，值是一个vector，保存家中孩子（们）的名。编写代码，实现添加新的家庭以及向已有家庭中添加新的孩子。

解：

map<string, vector<string>> m;
for (string ln; cout << "Last name:\n", cin >> ln && ln != "@q";)
	for (string cn; cout << "|-Children's names:\n", cin >> cn && cn != "@q";)
		m[ln].push_back(cn);

练习11.8

编写一个程序，在一个vector而不是一个set中保存不重复的单词。使用set的优点是什么？

解：

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

int main()
{
	std::vector<std::string> exclude = { "aa", "bb", "cc", "dd", "ee", "ff" };
	for (std::string word; std::cout << "Enter plz:\n", std::cin >> word;)
	{
		auto is_excluded = std::binary_search(exclude.cbegin(), exclude.cend(), word);
		auto reply = is_excluded ? "excluded" : "not excluded";
		std::cout << reply << std::endl;
	}
	return 0;
}

set的优点是集合本身的元素就是不重复。

关键字类型的要求

对于有序容器，关键字类型必须定义元素比较的方法。默认是<。
如果想传递一个比较的函数，可以这样定义：multiset<Sales_data, decltype(compareIsbn)*> bookstore(compareIsbn);

练习11.9

定义一个map，将单词与一个行号的list关联，list中保存的是单词所出现的行号。

解：

std::map<std::string, std::list<std::size_t>> m;

练习11.10

可以定义一个vector<int>::iterator 到 int 的map吗？list<int>::iterator 到 int 的map呢？对于两种情况，如果不能，解释为什么。

解：

可以定义 vector<int>::iterator 到 int 的map，但是不能定义 list<int>::iterator 到 int 的map。因为map的键必须实现 < 操作，list 的迭代器不支持比较运算。

练习11.11

不使用decltype 重新定义 bookstore。

解：

using Less = bool (*)(Sales_data const&, Sales_data const&);
std::multiset<Sales_data, Less> bookstore(less);

pair

在utility头文件中定义。
一个pair保存两个数据成员，两个类型不要求一样。

pair的操作：

操作	解释
`pair<T1, T2> p;`	`p`是一个`pair`，两个类型分别是`T1`和`T2`的成员都进行了值初始化。
`pair<T1, T2> p(v1, v2);`	`first`和`second`分别用`v1`和`v2`进行初始化。
`pair<T1, T2>p = {v1, v2};`	等价于`p(v1, v2)
`make_pair(v1, v2);`	`pair`的类型从`v1`和`v2`的类型推断出来。
`p.first`	返回`p`的名为`first`的数据成员。
`p.second`	返回`p`的名为`second`的数据成员。
`p1 relop p2`	运算关系符按字典序定义。
`p1 == p2`	必须两对元素两两相等
`p1 != p2`	同上

练习11.12

编写程序，读入string和int的序列，将每个string和int存入一个pair 中，pair保存在一个vector中。

解：

#include <vector>
#include <utility>
#include <string>
#include <iostream>

int main()
{
	std::vector<std::pair<std::string, int>> vec;
	std::string str;
	int i;
	while (std::cin >> str >> i)
		vec.push_back(std::pair<std::string, int>(str, i));

	for (const auto &p : vec)
		std::cout << p.first << ":" << p.second << std::endl;
}

练习11.13

在上一题的程序中，至少有三种创建pair的方法。编写此程序的三个版本，分别采用不同的方法创建pair。解释你认为哪种形式最易于编写和理解，为什么？

解：

vec.push_back(std::make_pair(str, i));
vec.push_back({ str, i });
vec.push_back(std::pair<string, int>(str, i));

使用花括号的初始化器最易于理解和编写。

练习11.14

扩展你在11.2.1节练习中编写的孩子姓达到名的map，添加一个pair的vector，保存孩子的名和生日。

解：

#include <iostream>
#include <map>
#include <string>
#include <vector>

using std::ostream;
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::make_pair;
using std::pair;
using std::vector;
using std::map;

class Families
{
public:
	using Child = pair<string, string>;
	using Children = vector<Child>;
	using Data = map<string, Children>;

	void add(string const& last_name, string const& first_name, string birthday)
	{
		auto child = make_pair(first_name, birthday);
		_data[last_name].push_back(child);
	}

	void print() const
	{
		for (auto const& pair : _data)
		{
			cout << pair.first << ":\n";
			for (auto const& child : pair.second)
				cout << child.first << " " << child.second << endl;
			cout << endl;
		}
	}

private:
	Data _data;
};

int main()
{
	Families families;
	auto msg = "Please enter last name, first name and birthday:\n";
	for (string l, f, b; cout << msg, cin >> l >> f >> b; families.add(l, f, b));
	families.print();

	return 0;
}

关联容器操作

关联容器额外的类型别名：

类型别名	解释
`key_type`	此容器类型的关键字类型
`mapped_type`	每个关键字关联的类型，只适用于`map`
`value_type`	对于`map`，是`pair<const key_type, mapped_type>`; 对于`set`，和`key_type`相同。

关联容器迭代器

解引用一个关联容器迭代器时，会得到一个类型为容器的value_type的值的引用。
set的迭代器是const的。
遍历关联容器：使用begin和end，遍历map、multimap、set、multiset时，迭代器按关键字升序遍历元素。

练习11.15

对一个int到vector<int>的map，其mapped_type、key_type和 value_type分别是什么？

解：

mapped_type : vector<int>
key_type : int
value_type : std::pair<const int,vector >

练习11.16

使用一个map迭代器编写一个表达式，将一个值赋予一个元素。

解：

std::map<int, string>::iterator it = m.begin();
it->second = "hello";

练习11.17

假定c是一个string的multiset，v 是一个string 的vector，解释下面的调用。指出每个调用是否合法：

copy(v.begin(), v.end(), inserter(c, c.end()));
copy(v.begin(), v.end(), back_inserter(c));
copy(c.begin(), c.end(), inserter(v, v.end()));
copy(c.begin(), c.end(), back_inserter(v));

解：

第二个调用不合法，因为 multiset 没有 push_back 方法。其他调用都合法。

练习11.18

写出第382页循环中map_it 的类型，不要使用auto 或 decltype。

解：

map<string, size_t>::const_iterator map_it = word_count.cbegin();

练习11.19

定义一个变量，通过对11.2.2节中的名为 bookstore 的multiset 调用begin()来初始化这个变量。写出变量的类型，不要使用auto 或 decltype。

解：

using compareType = bool (*)(const Sales_data &lhs, const Sales_data &rhs);
std::multiset<Sales_data, compareType> bookstore(compareIsbn);
std::multiset<Sales_data, compareType>::iterator c_it = bookstore.begin();

添加元素

关联容器insert操作：

`insert`操作	关联容器
`c.insert(v)` `c.emplace(args)`	`v`是`value_type`类型的对象；`args`用来构造一个元素。对于`map`和`set`，只有元素的关键字不存在`c`中才插入或构造元素。函数返回一个`pair`，包含一个迭代器，指向具有指定关键字的元素，以及一个指示插入是否成功的`bool`值。对于`multimap`和`multiset`则会插入范围中的每个元素。
`c.insert(b, e)` `c.insert(il)`	`b`和`e`是迭代器，表示一个`c::value_type`类型值的范围；`il`是这种值的花括号列表。函数返回`void`。对于 `map`和`set`，只插入关键字不在`c`中的元素。
`c.insert(p, v)` `c.emplace(p, args)`	类似`insert(v)`，但将迭代器`p`作为一个提示，指出从哪里开始搜索新元素应该存储的位置。返回一个迭代器，指向具有给定关键字的元素。

向map添加元素：

word_count.insert({word, 1});
word_count.insert(make_pair(word, 1));
word_count.insert(pair<string, size_t>(word, 1));
word_count.insert(map<string, size_t>::value_type (word, 1));

练习11.20

重写11.1节练习的单词计数程序，使用insert代替下标操作。你认为哪个程序更容易编写和阅读？解释原因。

解：

#include <iostream>
#include <map>
#include <string>

using std::string;
using std::map;
using std::cin;
using std::cout;

int main()
{
	map<string, size_t> counts;
	for (string word; cin >> word;)
	{
		auto result = counts.insert({ word, 1 });
		if (!result.second)
			++result.first->second;
	}
	for (auto const& count : counts)
		cout << count.first << " " << count.second << ((count.second > 1) ? " times\n" : " time\n");

	return 0;
}

使用insert更容易阅读和编写。insert有返回值，可以明确的体现出插入操作的结果。

练习11.21

假定word_count是一个string到size_t的map，word是一个string，解释下面循环的作用：

while (cin >> word)
	++word_count.insert({word, 0}).first->second;

解：

这条语句等价于：

while (cin >> word)
{
	auto result = word_count.insert({word, 0});
	++(result.first->second);
}

若insert成功：先添加一个元素，然后返回一个 pair，pair 的 first元素是一个迭代器。这个迭代器指向刚刚添加的元素，这个元素是pair，然后递增pair的second成员。
若insert失败：递增已有指定关键字的元素的 second成员。

练习11.22

给定一个map<string, vector<int>>，对此容器的插入一个元素的insert版本，写出其参数类型和返回类型。

std::pair<std::string, std::vector<int>>    // 参数类型
std::pair<std::map<std::string, std::vector<int>>::iterator, bool> // 返回类型

练习11.23

11.2.1节练习中的map 以孩子的姓为关键字，保存他们的名的vector，用multimap 重写此map。

解：

#include <map>
#include <string>
#include <iostream>

using std::string;
using std::multimap;
using std::cin;
using std::endl;

int main()
{
    multimap<string, string> families;
    for (string lname, cname; cin >> cname >> lname; families.emplace(lname, cname));
    for (auto const& family : families)
        std::cout << family.second << " " << family.first << endl;
}

删除元素

从关联容器中删除元素：

操作	解释
`c.erase(k)`	从`c`中删除每个关键字为`k`的元素。返回一个`size_type`值，指出删除的元素的数量。
`c.erase(p)`	从`c`中删除迭代器`p`指定的元素。`p`必须指向`c`中一个真实元素，不能等于`c.end()`。返回一个指向`p`之后元素的迭代器，若`p`指向`c`中的尾元素，则返回`c.end()`
`c.erase(b, e)`	删除迭代器对`b`和`e`所表示范围中的元素。返回`e`。

下标操作

map和unordered_map的下标操作：

操作	解释
`c[k]`	返回关键字为`k`的元素；如果`k`不在`c`中，添加一个关键字为`k`的元素，对其值初始化。
`c.at(k)`	访问关键字为`k`的元素，带参数检查；若`k`不存在在`c`中，抛出一个`out_of_range`异常。

练习11.24

下面的程序完成什么功能？

map<int, int> m;
m[0] = 1;

解：

添加一个元素到map中，如果该键存在，则重新赋值。

练习11.25

对比下面的程序与上一题程序

vector<int> v;
v[0] = 1;

解：

未定义行为，vector 的下标越界访问。

练习11.26

可以用什么类型来对一个map进行下标操作？下标运算符返回的类型是什么？请给出一个具体例子——即，定义一个map，然后写出一个可以用来对map进行下标操作的类型以及下标运算符将会返会的类型。

解：

std::map<int, std::string> m = { { 1,"ss" },{ 2,"sz" } };
using KeyType = std::map<int, std::string>::key_type;	
using ReturnType = std::map<int, std::string>::mapped_type;

查找元素

在一个关联容器中查找元素:

操作	解释
`c.find(k)`	返回一个迭代器，指向第一个关键字为`k`的元素，若`k`不在容器中，则返回尾后迭代器
`c.count(k)`	返回关键字等于`k`的元素的数量。对于不允许重复关键字的容器，返回值永远是0或1。
`c.lower_bound(k)`	返回一个迭代器，指向第一个关键字不小于`k`的元素。
`c.upper_bound(k)`	返回一个迭代器，指向第一个关键字大于`k`的元素。
`c.equal_range(k)`	返回一个迭代器`pair`，表示关键字等于`k`的元素的范围。若`k`不存在，`pair`的两个成员均等于`c.end()`。

lower_bound和upper_bound不适用于无序容器。
下标和at操作只适用于非const的map和unordered_map。

练习11.27

对于什么问题你会使用count来解决？什么时候你又会选择find呢？

解：

对于允许重复关键字的容器，应该用 count ; 对于不允许重复关键字的容器，应该用 find 。

练习11.28

对一个string到int的vector的map，定义并初始化一个变量来保存在其上调用find所返回的结果。

解：

map<string, vector<int>> m;
map<string, vector<int>>::iterator it = m.find("key");

练习11.29

如果给定的关键字不在容器中，upper_bound、lower_bound 和 equal_range 分别会返回什么？

解：

如果给定的关键字不在容器中，则 lower_bound和 upper_bound 会返回相等的迭代器，指向一个不影响排序的关键字插入位置。而equal_range 会返回一个 pair，pair 中的两个迭代器都指向关键字可以插入的位置。

练习11.30

对于本节最后一个程序中的输出表达式，解释运算对象pos.first->second的含义。

解：

pos 是一个pair，pos.first 是一个迭代器，指向匹配关键字的元素，该元素是一个 pair，访问该元素的第二个成员。

练习11.31

编写程序，定义一个作者及其作品的multimap。使用find在multimap中查找一个元素并用erase删除它。确保你的程序在元素不在map 中时也能正常运行。

解：

#include <map>
#include <string>
#include <iostream>

using std::string;

int main()
{
	std::multimap<string, string> authors{
		{ "alan", "DMA" },
		{ "pezy", "LeetCode" },
		{ "alan", "CLRS" },
		{ "wang", "FTP" },
		{ "pezy", "CP5" },
		{ "wang", "CPP-Concurrency" } };

	string author = "pezy";
	string work = "CP5";

	auto found = authors.find(author);
	auto count = authors.count(author);
	while (count)
	{
		if (found->second == work)
		{
			authors.erase(found);
			break;
		}
		++found;
		--count;
	}

	for (const auto &author : authors)
		std::cout << author.first << " " << author.second << std::endl;

	return 0;
}

练习11.32

使用上一题定义的multimap编写一个程序，按字典序打印作者列表和他们的作品。

解：

#include <map>
#include <set>
#include <string>
#include <iostream>

using std::string;

int main()
{
	std::multimap<string, string> authors{
		{ "alan", "DMA" },
		{ "pezy", "LeetCode" },
		{ "alan", "CLRS" },
		{ "wang", "FTP" },
		{ "pezy", "CP5" },
		{ "wang", "CPP-Concurrency" } };
	std::map<string, std::multiset<string>> order_authors;

	for (const auto &author : authors)
		order_authors[author.first].insert(author.second);

	for (const auto &author : order_authors)
	{
		std::cout << author.first << ": ";
		for (const auto &work : author.second)
			std::cout << work << " ";
		std::cout << std::endl;
	}

	return 0;
}

一个单词转换的map

练习11.33

实现你自己版本的单词转换程序。

解：

#include <map>
#include <string>
#include <fstream> 
#include <iostream>
#include <sstream>

using std::string; using std::ifstream;

std::map<string, string> buildMap(ifstream &map_file)
{
    std::map<string, string> trans_map;
    for (string key, value; map_file >> key && getline(map_file, value); )
        if (value.size() > 1) trans_map[key] = value.substr(1).substr(0, value.find_last_not_of(' '));
    return trans_map;
}

const string & transform(const string &s, const std::map<string, string> &m)
{
    auto map_it = m.find(s);
    return map_it == m.cend() ? s : map_it->second;
}

void word_transform(ifstream &map, ifstream &input)
{
    auto trans_map = buildMap(map);
    for (string text; getline(input, text); ) {
        std::istringstream iss(text);
        for (string word; iss >> word; )
            std::cout << transform(word, trans_map) << " ";
        std::cout << std::endl;
    }
}

int main()
{
    ifstream ifs_map("../data/word_transformation_bad.txt"), ifs_content("../data/given_to_transform.txt");
    if (ifs_map && ifs_content) word_transform(ifs_map, ifs_content);
    else std::cerr << "can't find the documents." << std::endl;
}

练习11.34

如果你将transform 函数中的find替换为下标运算符，会发生什么情况？

解：

如果使用下标运算符，当关键字未在容器中时，会往容器中添加一个新元素。

练习11.35

在buildMap中，如果进行如下改写，会有什么效果？

trans_map[key] = value.substr(1);
//改为
trans_map.insert({key, value.substr(1)});

解：

当一个转换规则的关键字多次出现的时候，使用下标运算符会保留最后一次添加的规则，而用insert则保留第一次添加的规则。

练习11.36

我们的程序并没检查输入文件的合法性。特别是，它假定转换规则文件中的规则都是有意义的。如果文件中的某一行包含一个关键字、一个空格，然后就结束了，会发生什么？预测程序的行为并进行验证，再与你的程序进行比较。

解：

如果关键字没有对应的规则，那么程序会抛出一个 runtime_error。

无序容器

有序容器使用比较运算符来组织元素；无序容器使用哈希函数和关键字类型的==运算符。
理论上哈希技术可以获得更好的性能。
无序容器在存储上组织为一组桶(bucket)，每个桶保存零个或多个元素。无序容器使用一个哈希函数将元素映射到桶。

无序容器管理操作：

操作	解释
桶接口
`c.bucket_count()`	正在使用的桶的数目
`c.max_bucket_count()`	容器能容纳的最多的桶的数目
`c.bucket_size(n)`	第`n`个桶中有多少个元素
`c.bucket(k)`	关键字为`k`的元素在哪个桶中
桶迭代
`local_iterator`	可以用来访问桶中元素的迭代器类型
`const_local_iterator`	桶迭代器的`const`版本
`c.begin(n)`，`c.end(n)`	桶`n`的首元素迭代器
`c.cbegin(n)`，`c.cend(n)`	与前两个函数类似，但返回`const_local_iterator`。
哈希策略
`c.load_factor()`	每个桶的平均元素数量，返回`float`值。
`c.max_load_factor()`	`c`试图维护的平均比桶大小，返回`float`值。`c`会在需要时添加新的桶，以使得`load_factor<=max_load_factor`
`c.rehash(n)`	重组存储，使得`bucket_count>=n`，且`bucket_count>size/max_load_factor`
`c.reverse(n)`	重组存储，使得`c`可以保存`n`个元素且不必`rehash`。

练习11.37

一个无序容器与其有序版本相比有何优势？有序版本有何优势？

无序容器拥有更好的性能，有序容器使得元素始终有序。

练习11.38

用 unordered_map 重写单词计数程序和单词转换程序。

解：

#include <unordered_map>
#include <set>
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>

using std::string;

void wordCounting()
{
    std::unordered_map<string, size_t> word_count;
    for (string word; std::cin >> word; ++word_count[word]);
    for (const auto &w : word_count)
        std::cout << w.first << " occurs " << w.second << (w.second > 1 ? "times" : "time") << std::endl;
}

void wordTransformation()
{
    std::ifstream ifs_map("../data/word_transformation.txt"), ifs_content("../data/given_to_transform.txt");
    if (!ifs_map || !ifs_content) {
        std::cerr << "can't find the documents." << std::endl;
        return;
    }
    
    std::unordered_map<string, string> trans_map;
    for (string key, value; ifs_map >> key && getline(ifs_map, value); )
        if (value.size() > 1) trans_map[key] = value.substr(1).substr(0, value.find_last_not_of(' '));
    
    for (string text, word; getline(ifs_content, text); std::cout << std::endl)
        for (std::istringstream iss(text); iss >> word; ) {
            auto map_it = trans_map.find(word);
            std::cout << (map_it == trans_map.cend() ? word : map_it->second) << " ";
        }
}

int main()
{
    //wordCounting();
    wordTransformation();
}

posted @ 2023-02-19 23:18 nullptrException 阅读(58) 评论(0) 收藏举报

刷新页面返回顶部

Epiephany

第十一章 关联容器

第十一章 关联容器

练习11.1

练习11.2

练习11.3

练习11.4

关联容器概述

定义关联容器

练习11.5

练习11.6

练习11.7

练习11.8

关键字类型的要求

练习11.9

练习11.10

练习11.11

pair

练习11.12

练习11.13

练习11.14

关联容器操作

关联容器迭代器

练习11.15

练习11.16

练习11.17

练习11.18

练习11.19

添加元素

练习11.20

练习11.21

练习11.22

练习11.23

删除元素

下标操作

练习11.24

练习11.25

练习11.26

查找元素

练习11.27

练习11.28

练习11.29

练习11.30

练习11.31

练习11.32

一个单词转换的map

练习11.33

练习11.34

练习11.35

练习11.36

无序容器

练习11.37

练习11.38

公告

第十一章关联容器

第十一章关联容器