[Python Study Notes] Notes on the Python collections module

This post draws on:
The collections module in Python
collections --- Container datatypes

ChainMap
Counter
deque
defaultdict
namedtuple()
OrderedDict
UserDict
UserList
UserString

This module implements specialized container datatypes that provide alternatives to Python's general-purpose built-in containers dict, list, set, and tuple.

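ChainMap: a dict-like class that groups multiple mappings into a single view
Counter: a dict subclass for counting hashable objects
deque: a list-like container with fast appends and pops at both ends
defaultdict: a dict subclass that calls a factory function to supply a value for missing keys
namedtuple(): a factory function for creating tuple subclasses with named fields
OrderedDict: a dict subclass that remembers the order in which entries were added
UserDict: a wrapper around dict objects that makes dict subclassing easier
UserList: a wrapper around list objects that makes list subclassing easier
UserString: a wrapper around string objects that makes string subclassing easier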

ChainMap

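The ChainMap class quickly links several mappings together so that they can be treated as a single unit.

It is often much faster than creating a new dictionary and calling update() repeatedly.

collections.ChainMap(*maps)

If no maps are given, a single empty dictionary is provided so that a new chain always has at least one mapping.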
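As a small illustration of the default described above, the sketch below builds a ChainMap with no arguments and inspects its maps attribute. The variable name empty_chain is only for illustration.

```python
from collections import ChainMap

# With no arguments, ChainMap starts with a single empty dict.
empty_chain = ChainMap()
print(empty_chain.maps)        # [{}]

# Writes always go to the first mapping in the chain.
empty_chain['color'] = 'red'
print(empty_chain.maps)        # [{'color': 'red'}]
```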

from collections import ChainMap

d1 = {'apple':1,'banana':2}
d2 = {'orange':2,'apple':3,'pike':1}

combined_d = ChainMap(d1,d2)
reverse_combind_d = ChainMap(d2,d1)

print(combined_d) 
print(reverse_combind_d)

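'''
Iteration visits the keys of the last dict first, then any new keys from earlier dicts;
for a key present in both dicts, only the value from the first dict is returned.
'''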
for k,v in combined_d.items():
    print(k,v)
print('------------------')    
for k,v in reverse_combind_d.items():
    print(k,v)

ChainMap({'apple': 1, 'banana': 2}, {'orange': 2, 'apple': 3, 'pike': 1})
ChainMap({'orange': 2, 'apple': 3, 'pike': 1}, {'apple': 1, 'banana': 2})
orange 2
apple 1
pike 1
banana 2
------------------
apple 3
banana 2
orange 2
pike 1

def collection_test2():
    from collections import ChainMap
    a = {"name": "leng"}
    b = {"age": 24}
    c = {"wife": "qian"}
    
    pylookup = ChainMap(a,b,c)
    
    print(pylookup)
    print(pylookup['age'],pylookup.maps)
    
    # update() only modifies the first dict in the chain
    pylookup.update({"age": 25})
    print(pylookup)
    
    pylookup.update({"age": 28})
    print(pylookup)
    
    # Changes to the underlying dicts are visible through the ChainMap
    b['age'] = 26
    c['wife'] = "zheng"
    print(pylookup)
    print(type(pylookup.maps))
    
    # Update a specific dict through the maps list
    pylookup.maps[0]['age']=20
    pylookup.maps[1]['age']=22
    print(pylookup)
    
    print("-----------")
    d = {"name": "leng"}
    e = {"name":"123"}
    cm = ChainMap(d,e)
    
    print(cm)
    print(cm['name'])
    
collection_test2()

ChainMap({'name': 'leng'}, {'age': 24}, {'wife': 'qian'})
24 [{'name': 'leng'}, {'age': 24}, {'wife': 'qian'}]
ChainMap({'name': 'leng', 'age': 25}, {'age': 24}, {'wife': 'qian'})
ChainMap({'name': 'leng', 'age': 28}, {'age': 24}, {'wife': 'qian'})
ChainMap({'name': 'leng', 'age': 28}, {'age': 26}, {'wife': 'zheng'})
<class 'list'>
ChainMap({'name': 'leng', 'age': 20}, {'age': 22}, {'wife': 'zheng'})
-----------
ChainMap({'name': 'leng'}, {'name': '123'})
leng

Counter

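Counter is a dict subclass for counting how often each object occurs; it is a tool for fast, convenient tallies.
Common methods:

elements(): returns an iterator that yields each element as many times as its count; elements with a count below 1 are skipped.
most_common([n]): returns a list of the n most common elements with their counts, from most to least common
subtract([iterable-or-mapping]): subtracts the counts of elements from an iterable or another mapping (or counter); both inputs and results may be zero or negative
update([iterable-or-mapping]): adds counts from an iterable or from another mapping (or counter).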
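The methods listed above can be tried out with a short sketch. The word list is the same one used elsewhere in this post; the variable name words is only for illustration.

```python
from collections import Counter

words = Counter('hello world hello world hello nihao'.split())

# most_common(n) returns the n highest counts as (element, count) pairs.
print(words.most_common(2))      # [('hello', 3), ('world', 2)]

# subtract() keeps zero and negative counts in the counter.
words.subtract(['nihao', 'nihao', 'world'])
print(words['nihao'])            # -1
print(words['world'])            # 1

# elements() skips anything whose count is below 1.
print(sorted(words.elements()))  # ['hello', 'hello', 'hello', 'world']
```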

from collections import Counter

# Count how often each character occurs
print(Counter('hello world'))

# Count how often each word occurs
c = Counter('hello world hello world hello nihao'.split())
print(c)

cnt = Counter()
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1
print(cnt)

# Look up the count of a given element; get() works too
print(c['hello'])
print(c.get('hello'))

# View the elements
print(list(c.elements()))
print(c.elements())

# Add counts, either with + or in place with c.update(d)
c = Counter('hello world hello world hello nihao'.split())
d = Counter('hello world'.split())
print(c)
print(d)
print(c + d)
c.update(d)
print(c)

# Subtract counts with - (drops non-positive results) or in place with c.subtract(d) (keeps them)
c = Counter('hello world hello world hello nihao'.split())
print(c - d)
c.subtract(d)
print(c)

# Clear all counts
c.clear()
print(c)

Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
Counter({'hello': 3, 'world': 2, 'nihao': 1})
Counter({'blue': 3, 'red': 2, 'green': 1})
3
3
['hello', 'hello', 'hello', 'world', 'world', 'nihao']
<itertools.chain object at 0x000001FC132A38C8>
Counter({'hello': 3, 'world': 2, 'nihao': 1})
Counter({'hello': 1, 'world': 1})
Counter({'hello': 4, 'world': 3, 'nihao': 1})
Counter({'hello': 4, 'world': 3, 'nihao': 1})
Counter({'hello': 2, 'world': 1, 'nihao': 1})
Counter({'hello': 2, 'world': 1, 'nihao': 1})
Counter()
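In the output above, c - d and c.subtract(d) happen to give the same result because no count drops to zero or below. The following sketch, with made-up counts, shows where the two differ.

```python
from collections import Counter

c = Counter(a=1, b=2)
d = Counter(a=3, b=1)

# The - operator returns a new Counter and drops results that are zero or negative.
print(c - d)          # Counter({'b': 1})

# subtract() works in place and keeps them.
c.subtract(d)
print(c)              # Counter({'b': 1, 'a': -2})
```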

deque

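collections.deque returns a new double-ended queue object, initialized left to right (using append()) with data from iterable. If iterable is not specified, the new deque is empty.

Deques are a generalization of stacks and queues. They support thread-safe appends and pops from either end, at approximately O(1) cost in either direction.

Although list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory-movement costs for pop(0) and insert(0, v), which change both the size and the position of the underlying data.

If maxlen is not specified or is None, deques may grow to an arbitrary length. Otherwise, the deque is bounded to the specified maximum length.

Once a bounded deque is full, adding new items discards the same number of items from the opposite end.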
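A minimal sketch of the bounded behavior described above; the name recent and the values are only for illustration.

```python
from collections import deque

recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)      # once full, the leftmost item is discarded
print(recent)             # deque([2, 3, 4], maxlen=3)

recent.appendleft(99)     # adding on the left discards from the right
print(recent)             # deque([99, 2, 3], maxlen=3)
```

This makes a bounded deque a convenient way to keep only the last few items of a stream.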

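Supported methods:

append(x): adds x to the right end
appendleft(x): adds x to the left end
clear(): removes all elements, leaving the deque with length 0
copy(): creates a shallow copy
count(x): counts the elements equal to x
extend(iterable): appends the elements of iterable to the right end
extendleft(iterable): appends the elements of iterable to the left end. Note: the elements are added one at a time on the left, so they end up in reverse order
index(x[,start[,stop]]): returns the position of x in the deque (at or after index start and before index stop). Returns the first match, or raises ValueError if not found.
insert(i,x): inserts x at position i. Note: if the insertion would cause a bounded deque to grow beyond maxlen, an IndexError is raised.
pop(): removes and returns the rightmost element; raises IndexError if the deque is empty
popleft(): removes and returns the leftmost element; raises IndexError if the deque is empty
remove(value): removes the first occurrence of value; raises ValueError if it is not found
reverse(): reverses the elements of the deque in place and returns None.
maxlen: the maximum length of the deque, or None if unbounded.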
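The error cases mentioned in the method list can be checked with a short sketch. The exact wording of the exception messages is not shown, since it may vary between Python versions.

```python
from collections import deque

d = deque('abc', maxlen=3)

try:
    d.insert(1, 'x')      # the deque is already at maxlen
except IndexError as exc:
    print('insert failed:', exc)

try:
    d.remove('z')         # 'z' is not in the deque
except ValueError as exc:
    print('remove failed:', exc)

try:
    deque().pop()         # popping from an empty deque
except IndexError as exc:
    print('pop failed:', exc)

print(list(d))            # ['a', 'b', 'c'] (unchanged by the failed calls)
```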

from collections import deque
d = deque(maxlen=20)
print(d)
d.extend('python')
print(d)

for elem in d:
    print(elem.upper())

deque([], maxlen=20)
deque(['p', 'y', 't', 'h', 'o', 'n'], maxlen=20)
P
Y
T
H
O
N

d.append('e')
print(d)

x = 'java'
d.appendleft(x)
print(d)

deque(['p', 'y', 't', 'h', 'o', 'n', 'e'], maxlen=20)
deque(['java', 'p', 'y', 't', 'h', 'o', 'n', 'e'], maxlen=20)

print(d.count('p'))

d.extendleft('cpp')
print(d)

1
deque(['p', 'p', 'c', 'java', 'p', 'y', 't', 'h', 'o', 'n', 'e'], maxlen=20)

x = 'java'
print(d.index(x))
print(d.index('p'))

3
0

d.insert(3,'a')
print(d)

deque(['p', 'p', 'c', 'a', 'java', 'p', 'y', 't', 'h', 'o', 'n', 'e'], maxlen=20)

d.pop()
print(d)

d.popleft()
print(d)

deque(['p', 'p', 'c', 'a', 'java', 'p', 'y', 't', 'h', 'o', 'n'], maxlen=20)
deque(['p', 'c', 'a', 'java', 'p', 'y', 't', 'h', 'o', 'n'], maxlen=20)

value = 'java'
d.remove(value)
print(d)

d.reverse()
print(d)

d.clear()
print(d)

deque(['p', 'c', 'a', 'p', 'y', 't', 'h', 'o', 'n'], maxlen=20)
deque(['n', 'o', 'h', 't', 'y', 'p', 'a', 'c', 'p'], maxlen=20)
deque([], maxlen=20)

defaultdict

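Returns a new dictionary-like object.

defaultdict is a subclass of the built-in dict class. It overrides one method and adds one writable instance variable.

The remaining functionality is the same as for the dict class and is not documented again here.

The object has an attribute named default_factory. The first argument to the constructor provides its initial value, which defaults to None.

All remaining arguments, including keyword arguments, are treated as if they were passed to the dict constructor.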
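The constructor behavior described above can be sketched as follows. The names plain and groups and the sample keys are only for illustration.

```python
from collections import defaultdict

# Without a default_factory, a missing key raises KeyError like a plain dict.
plain = defaultdict()
try:
    plain['missing']
except KeyError:
    print('KeyError raised')

# Remaining arguments are passed to dict(); here x=[1] seeds the mapping.
groups = defaultdict(list, x=[1])
groups['x'].append(2)
groups['y'].append(3)     # 'y' is created by calling list()
print(dict(groups))       # {'x': [1, 2], 'y': [3]}

# default_factory is writable; get() never triggers it.
groups.default_factory = None
print(groups.get('z'))    # None
print('z' in groups)      # False
```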

from collections import defaultdict
d = defaultdict()
print(d)

# Use list as the default_factory
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)

# Accessing a missing key inserts it with an empty list; to add a value, append to it: d[key].append(value)
d['hello']
print(d)
print(sorted(d.items()))

for k, v in s:
    d[k].append(v)

print(sorted(d.items()))

defaultdict(None, {})
defaultdict(<class 'list'>, {'hello': []})
[('hello', [])]
[('blue', [2, 4]), ('hello', []), ('red', [1]), ('yellow', [1, 3])]

# Setting default_factory to int makes defaultdict useful for counting
s = 'mississippi'
d = defaultdict(int)
for k in s:
    d[k] += 1

print(sorted(d.items()))

[('i', 4), ('m', 1), ('p', 2), ('s', 4)]

namedtuple()

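Named tuples assign meaning to each position in a tuple, which makes code more readable and self-documenting.

They can be used wherever regular tuples are used, and they add the ability to access fields by name in addition to position index.

(Judging by the examples, they feel a bit like entity classes in Java.)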

from collections import namedtuple
Person = namedtuple('Person', ['age', 'height', 'name'])
Human = namedtuple('Human', 'age, height, name')
student = namedtuple('student', 'age height name')

tom = Person(30,178,'Tom')
jack = Human(20,179,'Jack')

print(jack)
print(tom)

print(tom.age)

print(jack.height)

Human(age=20, height=179, name='Jack')
Person(age=30, height=178, name='Tom')
30
179
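Beyond attribute access, a named tuple still behaves like a regular tuple. The sketch below reuses the Person definition from the example above and adds _replace() and _asdict(), two standard namedtuple methods that the example did not cover.

```python
from collections import namedtuple

Person = namedtuple('Person', ['age', 'height', 'name'])
tom = Person(30, 178, 'Tom')

print(tom[0], tom.age)          # 30 30
age, height, name = tom         # unpacks like a regular tuple
print(isinstance(tom, tuple))   # True

# Instances are immutable; _replace() returns a modified copy.
older_tom = tom._replace(age=31)
print(older_tom)                # Person(age=31, height=178, name='Tom')

# _asdict() maps field names to values.
print(dict(tom._asdict()))      # {'age': 30, 'height': 178, 'name': 'Tom'}
```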

OrderedDict

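Ordered dictionaries are just like regular dictionaries but have some extra capabilities relating to ordering operations.

They have become less important now that the built-in dict class remembers insertion order (this behavior has been guaranteed since Python 3.7).

Some differences from dict still remain:

The regular dict was designed to be very good at mapping operations. Tracking insertion order was secondary.
OrderedDict was designed to be good at reordering operations. Space efficiency, iteration speed, and the performance of update operations were secondary.
Algorithmically, OrderedDict handles frequent reordering operations better than dict. This makes it suitable for tracking recent accesses (for example, in an LRU cache).
The equality operation for OrderedDict checks for matching order.
The popitem() method of OrderedDict has a different signature. It accepts an optional argument to specify which item is popped.
OrderedDict has a move_to_end() method to efficiently reposition an element to either end.
Until Python 3.8, dict lacked a __reversed__() method.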
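Several of the differences listed above can be seen in a few lines. The keys and values here are made up for illustration.

```python
from collections import OrderedDict

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])

print(a == b)                    # False: OrderedDict comparison is order-sensitive
print(dict(a) == dict(b))        # True: plain dict comparison ignores order

a.move_to_end('y', last=False)   # move 'y' to the front
print(list(a))                   # ['y', 'x']

print(a.popitem(last=False))     # ('y', 2): pop from the front (FIFO)
print(a.popitem())               # ('x', 1): default pops from the end (LIFO)
```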

from collections import OrderedDict

od = OrderedDict()
od['country'] = '1'
od['sex'] = '0'
od['age'] = '34'

print(od)

OrderedDict([('country', '1'), ('sex', '0'), ('age', '34')])

od.move_to_end('country')
print(od)

OrderedDict([('sex', '0'), ('age', '34'), ('country', '1')])
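The move_to_end() call shown above is also the core of the LRU-cache use case mentioned earlier. What follows is a minimal sketch only: the LRUCache class, its method names, and the capacity parameter are hypothetical and not part of the collections module.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, default=None):
        if key not in self.items:
            return default
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')            # 'a' becomes the most recently used
cache.put('c', 3)         # evicts 'b'
print(list(cache.items))  # ['a', 'c']
```

For real code, functools.lru_cache from the standard library is usually the simpler choice when the goal is to cache function results.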

UserDict

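UserDict acts as a wrapper around dictionary objects. The need for this class has been partially supplanted by the ability to subclass dict directly;

however, this class can be easier to work with because the underlying dictionary is accessible as an attribute (data).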
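As a rough illustration of why the data attribute helps, here is a sketch of a UserDict subclass. The class LowerKeyDict and its behavior are hypothetical and assume that all keys are strings.

```python
from collections import UserDict

class LowerKeyDict(UserDict):
    """Hypothetical example: stores every string key in lower case."""

    def __setitem__(self, key, value):
        # self.data is the underlying plain dict
        self.data[key.lower()] = value

    def __getitem__(self, key):
        return self.data[key.lower()]


headers = LowerKeyDict({'Content-Type': 'text/html'})
headers['ACCEPT'] = '*/*'
print(headers['content-type'])   # text/html
print(headers.data)              # {'content-type': 'text/html', 'accept': '*/*'}
```

The constructor routes the initial dict through __setitem__, so the override also applies to the initial contents.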

UserList

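UserList wraps list objects. It is a useful base class for your own list-like classes, which can inherit from it and override existing methods or add new ones.

In this way, new behavior can be added to lists.

The need for this class has been partially supplanted by the ability to subclass list directly;

however, this class can be easier to work with because the underlying list is accessible as an attribute (data).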
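A sketch of adding new behavior to a list through UserList follows. The class PositiveList is hypothetical, and it only guards append(); the initial contents and other methods such as extend() or insert() are deliberately left unchecked to keep the example short.

```python
from collections import UserList

class PositiveList(UserList):
    """Hypothetical example: append() rejects non-positive numbers."""

    def append(self, item):
        if item <= 0:
            raise ValueError('only positive numbers are allowed')
        self.data.append(item)    # self.data is the underlying plain list


nums = PositiveList([1, 2])
nums.append(3)
print(nums)          # [1, 2, 3]
print(nums.data)     # [1, 2, 3]

try:
    nums.append(-1)
except ValueError as exc:
    print(exc)       # only positive numbers are allowed
```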

UserString

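UserString acts as a wrapper around string objects. The need for this class has been partially supplanted by the ability to subclass str directly;

however, this class can be easier to work with because the underlying string is accessible as an attribute (data).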
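A short sketch of a UserString subclass; the class Shout and its shout() method are hypothetical and exist only to show access to the underlying string.

```python
from collections import UserString

class Shout(UserString):
    """Hypothetical example: a string wrapper with one extra method."""

    def shout(self):
        # self.data is the underlying plain str
        return self.data.upper() + '!'


s = Shout('hello')
print(s.shout())                     # HELLO!
print(s.data)                        # hello

# Inherited string methods return instances of the subclass.
print(isinstance(s.title(), Shout))  # True
```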

posted @ 2021-02-19 23:29 ryukirin
