Python 20: The json, pickle, and shelve Serialization Modules

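Serialization modules

What is serialization? Converting data such as dicts and lists into a string is called serialization.

Why serialize?

To store data
To transmit data over a network

Data type ---> string: serialization
String ---> data type: deserialization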

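There are three serialization modules: json, pickle, shelve
json:
  A general-purpose serialization format
  Only a small set of data types can be converted to a string with json
  The resulting string is the most human-readable of the three
pickle:
  Almost every Python data type can be serialized, and the result is bytes rather than text
  Only Python can understand pickled content
  Deserializing some objects depends on their code (such as the class definition) being available
shelve:
  Serializes through a file handle
  You work directly with the handle, which is very convenient
json has four functions: dumps, loads, dump, load
dumps serializes and loads deserializes. Both operate directly on data in memory.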

import json
# json: dumps serializes, loads deserializes; both work in memory
dic = {'k1':'v1'}
print(type(dic),dic)
str_d = json.dumps(dic)
print(type(str_d),str_d)

dic_d = json.loads(str_d)
print(type(dic_d),dic_d)

dumps loads

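  The data types that can be serialized are: numbers, strings, lists, dicts, tuples (converted to lists), plus True, False and None. Other types are not supported.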
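The type rules above can be checked directly. The following is a minimal sketch using only the standard json module; the variable names are illustrative and not from the original post.

```python
import json

# A tuple is written as a JSON array, so it comes back as a list.
round_tripped = json.loads(json.dumps((1, 2, 3)))
print(round_tripped)  # [1, 2, 3]

# Dict keys that are not strings (here an int) are converted to strings.
keys_after = list(json.loads(json.dumps({1: 'a'})).keys())
print(keys_after)  # ['1']

# A set has no JSON equivalent, so dumps raises TypeError.
try:
    json.dumps({1, 2, 3})
    set_error = None
except TypeError as exc:
    set_error = exc
print(type(set_error).__name__)  # TypeError
```

Note that the round trip is not always lossless: tuples and integer keys change type on the way back.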

json dump load: operate on files

import json
dic = {'a':1,'b':2}
f = open('fff','w',encoding='utf-8')
json.dump(dic,f)
f.close()

f = open('fff',encoding='utf-8')
res = json.load(f)
f.close()
print(res,type(res))

dump load
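dump and load are not tied to real files on disk: they only need an object with write() or read(). As a small sketch (io.StringIO is used here as a stand-in for a file, which is an assumption of this example and not something the original post does):

```python
import io
import json

dic = {'a': 1, 'b': 2}

# json.dump writes to any object with a write() method.
buffer = io.StringIO()
json.dump(dic, buffer)

# The text written by dump is the same text dumps returns.
same_text = buffer.getvalue() == json.dumps(dic)
print(same_text)  # True

# json.load reads from any object with a read() method.
buffer.seek(0)
restored = json.load(buffer)
print(restored)  # {'a': 1, 'b': 2}
```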

import json
dic = {'a':'中国','b':'国家'}
f = open('fff','w',encoding='utf-8')
json.dump(dic,f,ensure_ascii=False)  # pass ensure_ascii=False so the Chinese text is written as-is
f.close()

f = open('fff',encoding='utf-8')
res = json.load(f)
f.close()
print(res,type(res))

When the data contains Chinese characters
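To see what ensure_ascii actually changes, it helps to compare the two outputs side by side. A minimal sketch with dumps (the variable names are illustrative):

```python
import json

escaped = json.dumps('中国')                      # default: ensure_ascii=True
readable = json.dumps('中国', ensure_ascii=False)

print(escaped)   # "\u4e2d\u56fd"
print(readable)  # "中国"

# Both forms decode to the same Python string.
same_value = json.loads(escaped) == json.loads(readable)
print(same_value)  # True
```

The escaped form is still valid JSON; ensure_ascii=False only affects how readable the stored text is.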

import json
l = [{'k':111},{'k':111},{'k':111}]
f = open('file','w')
for dic in l:
    str_dic = json.dumps(dic)
    f.write(str_dic+'\n')
f.close()

l = []
f = open('file')
for line in f:
    dic = json.loads(line.strip())
    l.append(dic)
f.close()
print(l)

Writing to and reading from a file in several passes
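One reason for writing one JSON document per line, as above, is that json cannot decode several documents written back to back. The sketch below shows this with strings in place of a file (an assumption made to keep the example self-contained):

```python
import json

# Two JSON documents back to back, as repeated json.dump calls on one file would produce.
concatenated = json.dumps({'k': 111}) + json.dumps({'k': 111})

try:
    json.loads(concatenated)
    error_name = None
except json.JSONDecodeError as exc:
    error_name = type(exc).__name__
print(error_name)  # JSONDecodeError

# With one document per line, each line can be decoded on its own.
lines = json.dumps({'k': 111}) + '\n' + json.dumps({'k': 222}) + '\n'
records = [json.loads(line) for line in lines.splitlines()]
print(records)  # [{'k': 111}, {'k': 222}]
```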

Serialize obj to a JSON formatted str (a string representing the JSON object).
skipkeys: defaults to False. If a dict key is not a basic Python type (str, int, float, bool, None), a TypeError is raised when skipkeys is False. Set it to True to skip such keys instead.
ensure_ascii: defaults to True, in which case every non-ASCII character is written as a \uXXXX escape sequence. Set ensure_ascii to False when dumping and Chinese text stored in the JSON is displayed as-is.
If check_circular is false, then the circular reference check for container types will be skipped and a circular reference will result in a RecursionError (or worse).
If allow_nan is false, then it will be a ValueError to serialize out of range float values (nan, inf, -inf) in strict compliance of the JSON specification, instead of using the JavaScript equivalents (NaN, Infinity, -Infinity).
indent: should be a non-negative integer. With 0, each item goes on its own line with no indentation; with None (the default), everything is written on a single line in the most compact form; otherwise each item goes on its own line, indented by that many spaces per level. Output printed this way is also called pretty-printed JSON.
separators: a tuple (item_separator, key_separator). The default is (', ', ': ') when indent is None and (',', ': ') otherwise, so items are separated by "," and each key is separated from its value by ":". Pass (',', ':') for the most compact output.
default(obj) is a function that should return a serializable version of obj or raise TypeError. The default simply raises TypeError.
sort_keys: if True, the output is sorted by key.
To use a custom JSONEncoder subclass (e.g. one that overrides the .default() method to serialize additional types), specify it with the cls kwarg; otherwise JSONEncoder is used.

Notes on the other parameters
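Several of these parameters are easier to understand in action. The following is a rough sketch of skipkeys, default, cls and allow_nan; the helper encode_extra and the class DateEncoder are hypothetical names made up for this example, and turning dates into ISO strings is just one possible choice.

```python
import json
from datetime import date

# skipkeys=True drops keys that are not str, int, float, bool or None.
skipped = json.dumps({'ok': 1, (1, 2): 'tuple key'}, skipkeys=True)
print(skipped)  # {"ok": 1}

# default= is called for objects json cannot encode; here dates become ISO strings.
def encode_extra(obj):  # hypothetical helper
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f'{type(obj).__name__} is not JSON serializable')

with_default = json.dumps({'day': date(2019, 9, 15)}, default=encode_extra)
print(with_default)  # {"day": "2019-09-15"}

# The same logic as a JSONEncoder subclass, selected with cls=.
class DateEncoder(json.JSONEncoder):  # hypothetical class
    def default(self, obj):
        if isinstance(obj, date):
            return obj.isoformat()
        return super().default(obj)

with_cls = json.dumps({'day': date(2019, 9, 15)}, cls=DateEncoder)

# allow_nan=False rejects nan/inf instead of emitting NaN/Infinity.
try:
    json.dumps(float('nan'), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True
print(nan_rejected)  # True
```

A default function is usually enough for one-off calls; a JSONEncoder subclass is handy when the same rules are reused in many places.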

import json
data = {'username':['李华','二愣子'],'sex':'male','age':16}
json_dic2 = json.dumps(data,sort_keys=True,indent=2,separators=(',',':'),ensure_ascii=False)
print(json_dic2)

Pretty-printed JSON output

pickle dumps loads dump load

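Works with almost any type, but deserialization also needs the original code (for example, the class definition) to be available.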
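Why does deserialization depend on the code? A pickle records which class an object belongs to by name, not the class's code. The sketch below illustrates this with fractions.Fraction from the standard library, chosen here only as a convenient example class:

```python
import pickle
from fractions import Fraction

data = pickle.dumps(Fraction(1, 3))

# The byte stream names the module and the class; it does not contain the class's code.
names_present = b'fractions' in data and b'Fraction' in data
print(names_present)  # True

# Loading imports fractions.Fraction by that name and rebuilds the object.
restored = pickle.loads(data)
print(restored == Fraction(1, 3))  # True
```

If the named class cannot be imported in the program that loads the data, loading fails, which is why the same code has to be present on both sides.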

import pickle
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = pickle.dumps(dic)
print(str_dic)  # a bytes object

dic2 = pickle.loads(str_dic)
print(dic2)  # a dict

import time
struct_time = time.localtime(1000000000)
print(struct_time)
f = open('pick_file','wb')
pickle.dump(struct_time,f)
f.close()

f = open('pick_file','rb')
struct_time2 = pickle.load(f)
print(struct_time2.tm_year)
f.close()

pickle dumps loads dump load

import pickle
import time
struct_time1 = time.localtime(1000000000)
struct_time2 = time.localtime(2000000000)
# print(struct_time)
f = open('pick_file','wb')
pickle.dump(struct_time1,f)
pickle.dump(struct_time2,f)
f.close()

f = open('pick_file','rb')
struct_time3 = pickle.load(f)
struct_time4 = pickle.load(f)
print(struct_time3.tm_year)
print(struct_time4.tm_year)
f.close()

Writing to a file multiple times
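The example above calls load exactly as many times as dump was called. When the number of objects in the file is not known in advance, one common pattern is to keep loading until EOFError. A minimal sketch, using a temporary directory so that no files are left behind (the sample records are made up):

```python
import os
import pickle
import tempfile

records = [{'id': 1}, {'id': 2}, {'id': 3}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'pick_file')

    # Each dump call appends one self-delimiting pickle to the file.
    with open(path, 'wb') as f:
        for record in records:
            pickle.dump(record, f)

    # load returns one object per call and raises EOFError at end of file.
    loaded = []
    with open(path, 'rb') as f:
        while True:
            try:
                loaded.append(pickle.load(f))
            except EOFError:
                break

print(loaded)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```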

shelve

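shelve is another serialization tool that Python provides, and it is somewhat simpler to use than pickle.

shelve provides just one function, open. Data is accessed by key, so it works much like a dict.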

import shelve
f = shelve.open('shelve_file')
f['key'] = {'int':10,'float':9.5,'string':'sample data'} # assign through the handle to store the data in the file
f.close()

import shelve
f1 = shelve.open('shelve_file')
existing = f1['key']  # fetch the data by key; a missing key raises KeyError
f1.close()
print(existing)

shelve
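Since a shelf behaves much like a dict, the usual dict operations are available too. The following is a small sketch that assumes a throwaway file in a temporary directory (the name shelve_demo is made up for this example) and uses the with statement so the shelf is closed automatically:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'shelve_demo')  # hypothetical file name

    # Leaving the with block closes the shelf.
    with shelve.open(path) as db:
        db['key'] = {'int': 10, 'float': 9.5}
        db['other'] = [1, 2, 3]

    with shelve.open(path) as db:
        has_key = 'key' in db
        missing = db.get('missing', 'fallback')  # avoids KeyError for an absent key
        del db['other']
        remaining = sorted(db.keys())

print(has_key, missing, remaining)  # True fallback ['key']
```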

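This module has a limitation: it does not support multiple applications writing to the same DB at the same time. So when we know our application only reads, we can have shelve open the DB in read-only mode.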

First approach: flag='c' (the default) opens the DB for reading and writing
import shelve
f = shelve.open('shelve_file',flag='c')
existing = f['key']
f.close()
print(existing)

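Second approach: flag='r' opens the DB read-only
import shelve
f = shelve.open('shelve_file',flag='r')
existing = f['key']
try:
    f['key'] = 10  # a write to a read-only DB: most dbm backends raise an error here (older dbm.dumb versions let the write through)
except Exception as e:
    print('write rejected:', e)
f.close()
print(existing)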

f = shelve.open('shelve_file',flag='r')
existing2 = f['key']
f.close()
print(existing2)

shelve read-only

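By default, shelve does not track changes made to a persisted object after it has been read, so we need to override the default by passing writeback=True to shelve.open(); otherwise, changes made to the object in place are not saved.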

import shelve
f1 = shelve.open('shelve_file')
print(f1['key'])
f1['key']['new value'] = 'this was not here before'  # changes a temporary copy; not saved without writeback
f1.close()

import shelve
f1 = shelve.open('shelve_file',writeback=True)
print(f1['key'])
f1['key']['new value'] = 'this was not here before'  # saved on close() because writeback=True
f1.close()

Modifying a value nested inside a stored value

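writeback has both advantages and drawbacks. The advantage is that it lowers the chance of mistakes and makes persistence more transparent to the user. But it is not needed in every situation. With writeback enabled, shelve keeps every accessed object in an in-memory cache, which increases memory use, and when the DB is closed with close() every cached object is written back to the DB, which costs extra time. shelve has no way of knowing which cached objects were modified and which were not, so all of them are written.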
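When the cost of writeback is a concern, another option is to leave it off and assign the modified object back to its key explicitly. A minimal sketch, again assuming a throwaway file in a temporary directory (the name shelve_demo is made up for this example):

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'shelve_demo')  # hypothetical file name

    with shelve.open(path) as db:  # writeback left at its default, False
        db['key'] = {'int': 10}

        # Reading returns a copy; mutating it alone would be lost...
        value = db['key']
        value['new value'] = 'this was not here before'
        # ...so assign it back to the key to write it to the DB.
        db['key'] = value

    with shelve.open(path) as db:
        saved = db['key']

print(saved)  # {'int': 10, 'new value': 'this was not here before'}
```

This way only the keys that are reassigned get written, at the price of having to remember the reassignment.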

posted @ 2019-09-15 13:14 休由
