Dictionaries
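A dictionary (dict) is a mutable collection of key-value pairs in which every key is unique. It is the only built-in mapping type in Python and is written as key-value pairs enclosed in { }. When an entry is stored, Python computes a hash of the key and uses it to locate the slot that holds the key-value pair, so every key stored in a dict must be hashable. Dictionaries were traditionally described as unordered; since Python 3.7 they are guaranteed to preserve insertion order, although they still cannot be indexed by position.
Common hashable (immutable) types: int, str, bool, and tuple (a tuple is hashable only if all of its elements are hashable)
Unhashable (mutable) types: list, dict, set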
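Hashability can be checked directly with the built-in hash(), which raises TypeError for an unhashable object. The helper below is only a small sketch; the name is_hashable is made up for this illustration and is not part of the standard library.

```python
def is_hashable(obj):
    """Return True if obj could be used as a dict key (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, 2)))    # True: a tuple of ints
print(is_hashable([1, 2]))    # False: a list is mutable
print(is_hashable((1, [2])))  # False: the tuple contains a list
```

The last call shows why "tuple" belongs on the hashable list only with a caveat: a tuple is hashable only when everything inside it is.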
Definition and initialization
Use {} or dict() to create a dictionary.
d = dict()
print(type(d))
e = {}
type(e)
Output:
<class 'dict'>
dict
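# Invalid: each of these raises TypeError
# dic = {[1, 2, 3]: '周杰伦'} # a list is mutable, so it cannot be a key
# dic = {{1: 2}: "哈哈哈"} # a dict is mutable, so it cannot be a key
# dic = {{1, 2, 3}: '呵呵呵'} # a set is mutable, so it cannot be a key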
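dict(**kwargs) initializes a dictionary from name=value pairs. dict(iterable, **kwargs) builds a dictionary from an iterable plus optional name=value pairs; each element of the iterable must itself be a two-item structure.
d = dict(((1,'a'),(2,'b')))
print(d)
Output:
{1: 'a', 2: 'b'}
d = dict(([1,'a'],[2,'b']))
print(d)
Output:
{1: 'a', 2: 'b'}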
e = enumerate(range(4))
d = dict(e)
print(d)
Output:
{0: 0, 1: 1, 2: 2, 3: 3}
dict(mapping, **kwargs) builds a new dictionary from an existing one.
d = dict({'a':10, 'b':20, 'c':None, 'd':[1,2,3]})
print(d)
Output:
{'a': 10, 'b': 20, 'c': None, 'd': [1, 2, 3]}
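As a small illustration of dict(mapping, **kwargs), the sketch below copies an existing dictionary and adds one extra entry through a keyword argument. The names base and extended are chosen only for this example. Note that the copy is shallow, so mutable values are shared between the two dictionaries.

```python
base = {"a": 10, "b": [1, 2, 3]}
extended = dict(base, c=30)  # copy base, then add the pair 'c': 30

print(extended)  # {'a': 10, 'b': [1, 2, 3], 'c': 30}
print(base)      # {'a': 10, 'b': [1, 2, 3]}  (the original is unchanged)

# The copy is shallow: both dictionaries refer to the same list object.
print(extended["b"] is base["b"])  # True
```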
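The class method dict.fromkeys(iterable, value) creates a dictionary whose keys come from iterable and whose values are all set to value (None if value is omitted).
d = dict.fromkeys(range(5))
print(d)
Output:
{0: None, 1: None, 2: None, 3: None, 4: None}
d = dict.fromkeys(range(5),0)
print(d)
Output:
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0}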
# Pay special attention to value references: every key refers to the same list object.
d = dict.fromkeys(range(5),[1,2])
print(d)
d[4].append(3)
print(d)
Output:
{0: [1, 2], 1: [1, 2], 2: [1, 2], 3: [1, 2], 4: [1, 2]}
{0: [1, 2, 3], 1: [1, 2, 3], 2: [1, 2, 3], 3: [1, 2, 3], 4: [1, 2, 3]}
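One common way to avoid the shared-reference problem shown with fromkeys is a dict comprehension, which evaluates the value expression once per key. This is a sketch of that alternative, not the only possible fix.

```python
# Each iteration evaluates [1, 2] again, so every key gets its own list.
d = {k: [1, 2] for k in range(5)}
d[4].append(3)

print(d)  # {0: [1, 2], 1: [1, 2], 2: [1, 2], 3: [1, 2], 4: [1, 2, 3]}
print(d[0] is d[1])  # False: the lists are separate objects
```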
Accessing dictionary elements
d[key] returns the value for key and raises KeyError if the key does not exist.
d = {"a":1,"b":2,"c":3}
print(d["b"])
print(d["d"])
Output:
2
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-24-98a066a8e141> in <module>
      1 d = {"a":1,"b":2,"c":3}
      2 print(d["b"])
----> 3 print(d["d"])

KeyError: 'd'
get(key[, default]) returns the value for key. If the key does not exist it returns default, or None when no default is given.
dic = {"id": 123, "name": 'xpc', "age": 18}
print(dic.get("id",456))
print(dic.get("idd",456))
print(dic.get("idd"))
Output:
123
456
None
setdefault(key[, default]) returns the value for key. If the key does not exist, it inserts the pair key: default and returns default (None when default is not given). If the key already exists, setdefault leaves the dictionary unchanged.
dic = {"id": 123, "name": 'xpc', "age": 18}
print(dic.setdefault("id"))
print(dic.setdefault("sex","male"),dic)
print(dic.setdefault("idd"),dic)
dic.setdefault("id",123)
print(dic)
Output:
123
male {'id': 123, 'name': 'xpc', 'age': 18, 'sex': 'male'}
None {'id': 123, 'name': 'xpc', 'age': 18, 'sex': 'male', 'idd': None}
{'id': 123, 'name': 'xpc', 'age': 18, 'sex': 'male', 'idd': None}
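A typical use of setdefault is grouping items under a key without first checking whether the key exists. The word list below is invented purely for illustration.

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # sample data

groups = {}
for word in words:
    # Insert an empty list the first time a letter is seen, then append to it.
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Because setdefault returns the stored value, the append always acts on the list that is actually inside the dictionary.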
Adding and modifying entries
d[key] = value sets the value for key to value; if the key does not exist, a new key-value pair is added.
d = {"a":1,"b":2,"c":3}
d["a"] = 10
d["d"] = 5
print(d)
Output:
{'a': 10, 'b': 2, 'c': 3, 'd': 5}
update([other]) -> None updates the dictionary in place with the key-value pairs from another dictionary, from an iterable of key-value pairs, or from keyword arguments. Keys that do not exist are added, and keys that already exist have their values overwritten.
c1 = {}
c1.update(red=1)
print(c1)
Output:
{'red': 1}
c1.update((('red',2),))
print(c1)
Output:
{'red': 2}
d1 = {"green":1}
d1.update({'red':3})
print(d1)
Output:
{'green': 1, 'red': 3}
dic = {"id": 123, "name": 'xpc', "age": 18}
dic1 = {"id": 456, "name": "xpcs", "ok": "wtf"}
dic.update(dic1) # copy the contents of dic1 into dic: existing keys are overwritten, missing keys are added
print(dic)
print(dic1)
Output:
{'id': 456, 'name': 'xpcs', 'age': 18, 'ok': 'wtf'}
{'id': 456, 'name': 'xpcs', 'ok': 'wtf'}
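Because update modifies the dictionary in place, it is sometimes more convenient to build a merged copy and leave both inputs untouched. The sketch below uses dictionary unpacking, which is available from Python 3.5; on Python 3.9 and later, dic | dic1 gives the same result.

```python
dic = {"id": 123, "name": "xpc", "age": 18}
dic1 = {"id": 456, "name": "xpcs", "ok": "wtf"}

# Later entries win, so values from dic1 override those from dic.
merged = {**dic, **dic1}

print(merged)  # {'id': 456, 'name': 'xpcs', 'age': 18, 'ok': 'wtf'}
print(dic)     # {'id': 123, 'name': 'xpc', 'age': 18}  (unchanged)
```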
Deleting entries
pop(key[, default]): if the key exists, it is removed and its value is returned. If the key does not exist, default is returned; if default is not given either, KeyError is raised.
dic = {"id": 123, "name": 'xpc', "age": 18}
ret = dic.pop("id")
print(ret)
print(dic.pop("idd",-1))
print(dic.pop("idd"))
Output:
123
-1
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-53-57358572fa03> in <module>
      4 
      5 print(dic.pop("idd",-1))
----> 6 print(dic.pop("idd"))

KeyError: 'idd'
popitem() removes a key-value pair and returns it as a tuple; it raises KeyError if the dictionary is empty. Since Python 3.7 the pairs are returned in LIFO order (the most recently inserted pair first); in earlier versions the pair was arbitrary.
dic = {"id": 123, "name": 'xpc', "age": 18}
ret = dic.popitem()
print(ret)
Output:
('age', 18)
dic = {}
print(dic.popitem())
Output:
KeyError                                  Traceback (most recent call last)
<ipython-input-57-86a9c7ea7a09> in <module>
      1 dic = {}
----> 2 print(dic.popitem())

KeyError: 'popitem(): dictionary is empty'
dic = {"id": 123, "name": 'xpc', "age": 18}
a,b = dic.popitem()
print(a,b)
Output:
age 18
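The sketch below drains a dictionary with popitem to make the removal order visible. It assumes Python 3.7 or later, where the last inserted pair is removed first; the key names are invented for the example.

```python
stack = {}
stack["first"] = 1
stack["second"] = 2
stack["third"] = 3

order = []
while stack:                      # an empty dict is falsy
    order.append(stack.popitem())

print(order)  # [('third', 3), ('second', 2), ('first', 1)]
```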
clear() empties the dictionary.
dic = {"id": 123, "name": 'xpc', "age": 18}
dic.clear()
print(dic)
Output:
{}
The del statement
a = True
b = [6]
d = {"a":1,"b":b,"c":[1,3,5]}
del a
a
Output:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-15-23612501a197> in <module>
      3 d = {"a":1,"b":b,"c":[1,3,5]}
      4 del a
----> 5 a

NameError: name 'a' is not defined
del d["c"]
print(d)
Output:
{'a': 1, 'b': [6]}
del b[0]
print(b,d)
Output:
[] {'a': 1, 'b': []}
c = b
print(b,c)
print(d)
Output:
[] []
{'a': 1, 'b': []}
del c
print(b,d)
print(c)
Output:
[] {'a': 1, 'b': []}
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-21-c15658afec2c> in <module>
      1 del c
      2 print(b,d)
----> 3 print(c)

NameError: name 'c' is not defined
del b
print(d)
print(b)
Output:
{'a': 1, 'b': []}
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-22-5aac62a9d7d7> in <module>
      1 del b
      2 print(d)
----> 3 print(b)

NameError: name 'b' is not defined
b = d["b"]
print(b,d)
Output:
[] {'a': 1, 'b': []}
As these examples show, del d['c'] looks as if it deletes an object, but it only removes one reference to that object. del deletes a name (or a dictionary entry), not the object itself; the object is reclaimed only once nothing refers to it any longer.
Iterating over a dictionary
There are two ways to iterate over keys: for ... in dic, and for ... in dic.keys().
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic:
    print(a)
Output:
id
name
age
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic.keys():
    print(a)
Output:
id
name
age
There are three ways to iterate over values: for k in d: print(d[k]), for k in d.keys(): print(d.get(k)), and for v in d.values(): print(v).
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic:
    print(dic[a])
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic:
    print(dic.get(a))
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic.keys():
    print(dic[a])
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic.keys():
    print(dic.get(a))
dic = {"id": 123, "name": 'xpc', "age": 18}
for a in dic.values():
    print(a)
Output in every case:
123
xpc
18
Iterating over items, that is, key-value pairs.
dic = {"id": 123, "name": 'xpc', "age": 18}
for item in dic.items():
    print(item)
Output:
('id', 123)
('name', 'xpc')
('age', 18)
dic = {"id": 123, "name": 'xpc', "age": 18}
for item in dic.items():
    print(item[0], item[1])
Output:
id 123
name xpc
age 18
dic = {"id": 123, "name": 'xpc', "age": 18}
for k,v in dic.items():
    print(k, v)
Output:
id 123
name xpc
age 18
dic = {"id": 123, "name": 'xpc', "age": 18}
for k, _ in dic.items():
    print(k)
Output:
id
name
age
dic = {"id": 123, "name": 'xpc', "age": 18}
for _ ,v in dic.items():
    print(v)
Output:
123
xpc
18
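In Python 3, the keys, values, and items methods return dictionary view objects. A view is iterable but does not copy the dictionary's contents into a new list; it is a dynamic view of the dictionary's entries, so when the dictionary changes, the view reflects the change.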
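In Python 2, these methods return a new list, which takes up additional memory. Python 2 code is therefore advised to use iterkeys, itervalues, and iteritems, which return an iterator rather than a copy.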
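The dynamic behaviour of Python 3 views can be seen in a few lines. This is a minimal sketch with made-up data; it also shows that key views support set operations, and that list() takes a snapshot that no longer follows the dictionary.

```python
d = {"a": 1, "b": 2}
keys = d.keys()          # a view, not a copy

d["c"] = 3               # change the dictionary after creating the view
print(list(keys))        # ['a', 'b', 'c']: the view sees the new key

other = {"b": 20, "c": 30, "z": 0}
print(sorted(keys & other.keys()))  # ['b', 'c']: keys shared by both dicts

snapshot = list(d.keys())  # an independent list
d["e"] = 5
print(snapshot)          # ['a', 'b', 'c']: the snapshot does not change
```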
Removing entries while iterating
How do you remove entries while iterating? The wrong approach is to call pop directly inside the loop over the dictionary.
# Wrong approach
d = dict(a=1, b=2, c='abc')
print(d)
for k,v in d.items():
    d.pop(k) # raises an exception
Output:
{'a': 1, 'b': 2, 'c': 'abc'}
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-105-faeb0e8349bb> in <module>
      1 d = dict(a=1, b=2, c='abc')
      2 print(d)
----> 3 for k,v in d.items():
      4     d.pop(k) # raises an exception

RuntimeError: dictionary changed size during iteration
d = dict(a=1, b=2, c='abc')
while len(d): # this just empties the dict; calling clear() directly is simpler
    print(d.popitem())
Output:
('c', 'abc')
('b', 2)
('a', 1)
# Correct approach: collect the keys first, then remove them
d = dict(a=1, b=2, c='abc')
keys = []
for k,v in d.items():
    if isinstance(v, str):
        keys.append(k)
for k in keys:
    d.pop(k)
print(d)
Output:
{'a': 1, 'b': 2}
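Two shorter variants of the same idea are sketched below: iterating over a snapshot made with list(), and building a new dictionary with a comprehension. Both avoid changing the size of the dictionary that is being iterated over.

```python
# Variant 1: iterate over a snapshot of the items, mutate the original.
d = dict(a=1, b=2, c='abc')
for k, v in list(d.items()):
    if isinstance(v, str):
        d.pop(k)
print(d)   # {'a': 1, 'b': 2}

# Variant 2: build a filtered copy instead of mutating in place.
d2 = dict(a=1, b=2, c='abc')
d2 = {k: v for k, v in d2.items() if not isinstance(v, str)}
print(d2)  # {'a': 1, 'b': 2}
```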
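Dictionary keys
The requirements for a key are the same as for an element of a set: the elements of a set can be regarded as keys, and a set can be regarded as a simplified dict. Only hashable objects can be used as keys, and hash() can be used to test this.
d = {1 : 0, 2.0 : 3, "abc" : None, ('hello', 'world', 'python') : "string", b'abc' : '135'}
print(d)
Output:
{1: 0, 2.0: 3, 'abc': None, ('hello', 'world', 'python'): 'string', b'abc': '135'}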
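One consequence of hash-based keys is worth a short example: keys that compare equal and have the same hash are treated as the same key. The sketch below uses 1, 1.0, and True, which are all equal in Python.

```python
d = {1: "int", 1.0: "float", True: "bool"}

print(d)       # {1: 'bool'}: the first key is kept, the last value wins
print(len(d))  # 1
print(hash(1) == hash(1.0) == hash(True))  # True
```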
defaultdict
collections.defaultdict([default_factory[, ...]]): the first argument is default_factory, which defaults to None and supplies an initializer function. When a key does not exist, this factory function is called to produce the value for that key.
import random
d1 = {}
for k in 'abcdef':
    for i in range(random.randint(1,5)):
        if k not in d1.keys():
            d1[k] = []
        d1[k].append(i)
print(d1)
Output:
{'a': [0], 'b': [0, 1, 2], 'c': [0, 1, 2, 3, 4], 'd': [0, 1], 'e': [0, 1, 2], 'f': [0]}
from collections import defaultdict
import random
d1 = defaultdict(list)
for k in 'abcdef':
    for i in range(random.randint(1,5)):
        d1[k].append(i)
print(d1)
Output:
defaultdict(<class 'list'>, {'a': [0, 1, 2, 3, 4], 'b': [0, 1, 2, 3, 4], 'c': [0, 1, 2], 'd': [0, 1, 2], 'e': [0], 'f': [0, 1]})
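The factory is triggered only by d[key] lookups on a missing key, which is easy to overlook. The sketch below uses int as the factory (int() returns 0) to count characters in a sample string, then shows that get() and the in operator do not create entries, while indexing does.

```python
from collections import defaultdict

counts = defaultdict(int)      # a missing key starts at int() == 0
for ch in "banana":
    counts[ch] += 1
print(dict(counts))            # {'b': 1, 'a': 3, 'n': 2}

print(counts.get("z"))         # None: get() does not call the factory
print("z" in counts)           # False
print(counts["z"])             # 0: indexing a missing key inserts it
print("z" in counts)           # True
```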
OrderedDict
collections.OrderedDict([items]): before Python 3.7, a plain dict did not guarantee that keys were kept in insertion order, so OrderedDict was used to record the order.
from collections import OrderedDict
import random
d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
print(d)
keys = list(d.keys())
random.shuffle(keys)
print(keys)
od = OrderedDict()
for key in keys:
    od[key] = d[key]
print(od)
print(od.keys())
Output:
{'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
['pear', 'apple', 'banana', 'orange']
OrderedDict([('pear', 1), ('apple', 4), ('banana', 3), ('orange', 2)])
odict_keys(['pear', 'apple', 'banana', 'orange'])
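An ordered dictionary records the order in which entries were inserted and prints them in that order. The dict in CPython 3.6 already records key insertion order as an implementation detail (this may not be visible in IPython), and from Python 3.7 onward the ordering is guaranteed by the language.
Use cases:
- Suppose a dictionary records N products, added in ascending order of ID
- Besides looking products up by key, you sometimes need to iterate over the IDs, and you want them in the order they were entered, because that order is already sorted
- Otherwise the values obtained from the iteration would have to be sorted again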
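Even now that plain dictionaries keep insertion order, OrderedDict still behaves differently in a few ways. The sketch below, with made-up keys, shows two of them: equality between two OrderedDict objects is order-sensitive, and move_to_end can reorder an existing key.

```python
from collections import OrderedDict

a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}

print(a == b)                             # True: plain dicts ignore order
print(OrderedDict(a) == OrderedDict(b))   # False: order matters here

od = OrderedDict(a)
od.move_to_end("x")                       # move an existing key to the end
print(list(od))                           # ['y', 'x']
```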
Exercise 1: the user enters a number; print each digit and the number of times it occurs.
num = input("Enter a number: ")
d = {}
for c in num:
    if c not in d.keys():
        d[c]=1
    else:
        d[c]+=1
print(d)
num = input("Enter a number: ")
d = {}
for c in num:
    if not d.get(c):
        d[c]=1
        continue
    d[c]+=1
print(d)
Output:
Enter a number: 12343
{'1': 1, '2': 1, '3': 2, '4': 1}
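The standard library also offers collections.Counter for this kind of tally. The sketch below is an alternative to the hand-written loops, not a replacement for the exercise; it uses a fixed string instead of input() so that it runs without interaction.

```python
from collections import Counter

num = "12343"            # stands in for the user's input
counts = Counter(num)    # counts each character of the string

print(dict(counts))            # {'1': 1, '2': 1, '3': 2, '4': 1}
print(counts.most_common(1))   # [('3', 2)]: the most frequent digit
```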
Counting repeated numbers: generate 100 random integers in the range [-1000, 1000], then print each number and the number of times it occurs, in ascending order.
import random
n = 100
nums = [0]*n
for i in range(n):
    nums[i] = random.randint(-1000,1000)
print(nums)
t=nums.copy()
t.sort()
print(t)
d = {}
for x in nums:
    if x not in d.keys():
        d[x]=1
    else:
        d[x]+=1
print(d)
d1 = sorted(d.items())
print(d1)
Output:
[78, -325, 533, 125, 742, 0, 886, 31, -135, 775, -191, 798, -426, 415, -662, 958, -16, -159, 223, -163, 52, -455, -903, 584, -647, 152, -790, 150, -653, -811, 498, -462, -518, -512, 787, 613, -362, 510, 982, 239, -97, 326, 318, 624, 20, 793, -811, 149, -501, -18, -548, -444, -195, 722, 657, 798, -850, 398, 151, -91, -581, 763, -986, 185, -57, 103, -554, 707, 964, -675, 408, 207, -382, -312, -794, -857, -139, -293, 792, -987, 976, 127, -291, -906, 220, -733, -542, 701, -680, -981, 288, -868, -283, 432, -250, -916, 908, -625, 849, 872]
[-987, -986, -981, -916, -906, -903, -868, -857, -850, -811, -811, -794, -790, -733, -680, -675, -662, -653, -647, -625, -581, -554, -548, -542, -518, -512, -501, -462, -455, -444, -426, -382, -362, -325, -312, -293, -291, -283, -250, -195, -191, -163, -159, -139, -135, -97, -91, -57, -18, -16, 0, 20, 31, 52, 78, 103, 125, 127, 149, 150, 151, 152, 185, 207, 220, 223, 239, 288, 318, 326, 398, 408, 415, 432, 498, 510, 533, 584, 613, 624, 657, 701, 707, 722, 742, 763, 775, 787, 792, 793, 798, 798, 849, 872, 886, 908, 958, 964, 976, 982]
{78: 1, -325: 1, 533: 1, 125: 1, 742: 1, 0: 1, 886: 1, 31: 1, -135: 1, 775: 1, -191: 1, 798: 2, -426: 1, 415: 1, -662: 1, 958: 1, -16: 1, -159: 1, 223: 1, -163: 1, 52: 1, -455: 1, -903: 1, 584: 1, -647: 1, 152: 1, -790: 1, 150: 1, -653: 1, -811: 2, 498: 1, -462: 1, -518: 1, -512: 1, 787: 1, 613: 1, -362: 1, 510: 1, 982: 1, 239: 1, -97: 1, 326: 1, 318: 1, 624: 1, 20: 1, 793: 1, 149: 1, -501: 1, -18: 1, -548: 1, -444: 1, -195: 1, 722: 1, 657: 1, -850: 1, 398: 1, 151: 1, -91: 1, -581: 1, 763: 1, -986: 1, 185: 1, -57: 1, 103: 1, -554: 1, 707: 1, 964: 1, -675: 1, 408: 1, 207: 1, -382: 1, -312: 1, -794: 1, -857: 1, -139: 1, -293: 1, 792: 1, -987: 1, 976: 1, 127: 1, 
-291: 1, -906: 1, 220: 1, -733: 1, -542: 1, 701: 1, -680: 1, -981: 1, 288: 1, -868: 1, -283: 1, 432: 1, -250: 1, -916: 1, 908: 1, -625: 1, 849: 1, 872: 1} [(-987, 1), (-986, 1), (-981, 1), (-916, 1), (-906, 1), (-903, 1), (-868, 1), (-857, 1), (-850, 1), (-811, 2), (-794, 1), (-790, 1), (-733, 1), (-680, 1), (-675, 1), (-662, 1), (-653, 1), (-647, 1), (-625, 1), (-581, 1), (-554, 1), (-548, 1), (-542, 1), (-518, 1), (-512, 1), (-501, 1), (-462, 1), (-455, 1), (-444, 1), (-426, 1), (-382, 1), (-362, 1), (-325, 1), (-312, 1), (-293, 1), (-291, 1), (-283, 1), (-250, 1), (-195, 1), (-191, 1), (-163, 1), (-159, 1), (-139, 1), (-135, 1), (-97, 1), (-91, 1), (-57, 1), (-18, 1), (-16, 1), (0, 1), (20, 1), (31, 1), (52, 1), (78, 1), (103, 1), (125, 1), (127, 1), (149, 1), (150, 1), (151, 1), (152, 1), (185, 1), (207, 1), (220, 1), (223, 1), (239, 1), (288, 1), (318, 1), (326, 1), (398, 1), (408, 1), (415, 1), (432, 1), (498, 1), (510, 1), (533, 1), (584, 1), (613, 1), (624, 1), (657, 1), (701, 1), (707, 1), (722, 1), (742, 1), (763, 1), (775, 1), (787, 1), (792, 1), (793, 1), (798, 2), (849, 1), (872, 1), (886, 1), (908, 1), (958, 1), (964, 1), (976, 1), (982, 1)]
Counting repeated strings: from the character table "abcdefghijklmnopqrstuvwsyz" (as given; note that it contains "s" twice and no "x"), randomly pick 2 letters to form a string, do this 100 times, then print the strings and the number of times each occurs, in descending order.
import random
alphabet = "abcdefghijklmnopqrstuvwsyz"
words = []
for _ in range(100):
    #words.append("".join(random.choice(alphabet) for _ in range(2) )) # generator expression
    #words.append(random.choice(alphabet)+random.choice(alphabet))
    words.append("".join(random.sample(alphabet,2))) # random sampling
d = {}
for x in words:
    d[x]= d.get(x,0)+1
print(d)
d1 = sorted(d.items(),reverse=True)
print(d1)
Output:
{'gz': 1, 'do': 1, 'rl': 1, 'bg': 1, 'wo': 2, 'hu': 1, 'sf': 2, 'uh': 1, 'ol': 1, 'ag': 1, 'gc': 2, 'dw': 1, 'to': 1, 'cp': 1, 'sq': 1, 'wp': 1, 'mh': 1, 'ot': 2, 'pw': 1, 'oa': 1, 'qm': 1, 'py': 2, 'fe': 1, 'nt': 1, 'jy': 1, 'of': 1, 'ca': 1, 'tf': 1, 'kw': 1, 'qs': 2, 'jq': 1, 'he': 1, 'ep': 1, 'qe': 1, 'em': 1, 'ua': 1, 'zw': 1, 'ye': 1, 've': 1, 'rw': 1, 'pr': 1, 'mz': 1, 'ov': 1, 'aw': 1, 'rm': 1, 'qg': 1, 'pj': 1, 'za': 2, 'hr': 1, 'si': 1, 'lc': 1, 'uw': 1, 'qa': 2, 'pn': 1, 'ue': 1, 'wg': 1, 'ae': 1, 'cy': 1, 'hp': 1, 'qu': 1, 'fy': 1, 'ds': 1, 'yu': 1, 'ek': 1, 'sn': 1, 'op': 1, 'eb': 1, 'aq': 2, 'jf': 1, 'qp': 1, 'rs': 1, 'ko': 1, 'th': 1, 'fs': 1, 'qi': 1, 'pq': 1, 'yg': 1, 'om': 1, 'ew': 1, 'lf': 1, 'ka': 1, 'zk': 1, 'bs': 1, 'fu': 1, 'ec': 1, 'zc': 1, 'gs': 1, 'lz': 1, 'qv': 1, 'fr': 1, 'oy': 1}
[('zw', 1), ('zk', 1), ('zc', 1), ('za', 2), ('yu', 1), ('yg', 1), ('ye', 1), ('wp', 1), ('wo', 2), ('wg', 1), ('ve', 1), ('uw', 1), ('uh', 1), ('ue', 1), ('ua', 1), ('to', 1), ('th', 1), ('tf', 1), ('sq', 1), ('sn', 1), ('si', 1), ('sf', 2), ('rw', 1), ('rs', 1), ('rm', 1), ('rl', 1), ('qv', 1), ('qu', 1), ('qs', 2), ('qp', 1), ('qm', 1), ('qi', 1), ('qg', 1), ('qe', 1), ('qa', 2), ('py', 2), ('pw', 1), ('pr', 1), ('pq', 1), ('pn', 1), ('pj', 1), ('oy', 1), ('ov', 1), ('ot', 2), ('op', 1), ('om', 1), ('ol', 1), ('of', 1), ('oa', 1), ('nt', 1), ('mz', 1), ('mh', 1), ('lz', 1), ('lf', 1), ('lc', 1), ('kw', 1), ('ko', 1), ('ka', 1), ('jy', 1), ('jq', 1), ('jf', 1), ('hu', 1), ('hr', 1), ('hp', 1), ('he', 1), ('gz', 1), ('gs', 1), ('gc', 2), ('fy', 1), ('fu', 1), ('fs', 1), ('fr', 1), ('fe', 1), 
('ew', 1), ('ep', 1), ('em', 1), ('ek', 1), ('ec', 1), ('eb', 1), ('dw', 1), ('ds', 1), ('do', 1), ('cy', 1), ('cp', 1), ('ca', 1), ('bs', 1), ('bg', 1), ('aw', 1), ('aq', 2), ('ag', 1), ('ae', 1)]
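The solution above sorts by the string itself in descending order. If "descending" is instead read as "most frequent first", a sort key can put the count first. The sketch below is one possible reading, not the exercise's official answer; it assumes the standard 26-letter alphabet from string.ascii_lowercase and a fixed seed (both chosen here for illustration) so that runs are repeatable.

```python
import random
import string
from collections import Counter

random.seed(0)  # assumption: fixed seed, only to make the run repeatable
alphabet = string.ascii_lowercase  # assumption: the standard a-z alphabet

# Build 100 two-letter strings; sample() picks two distinct positions.
words = ["".join(random.sample(alphabet, 2)) for _ in range(100)]
counts = Counter(words)

# Sort by count descending, then by string ascending to break ties.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked[:5])
```

Negating the count inside the key keeps a single ascending sort while still listing the largest counts first.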