Python data compression

zlib compression

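import zlib
import this  # importing this prints the Zen of Python; this.s holds its ROT13-encoded text
s = this.s.encode('utf8')*10
for i in range(10):
    data = zlib.compress(s,i) # compress takes two arguments: the bytes to compress and the compression level (0-9)
    de_data = zlib.decompress(data) # decompress back to the original bytes
    print(f"data:{len(data)},s:{len(s)}")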

The output is:

data:8571,s:8560
data:562,s:8560
data:560,s:8560
data:558,s:8560
data:519,s:8560
data:511,s:8560 # once the limit is reached, higher levels cannot compress the data any further
data:511,s:8560
data:511,s:8560
data:511,s:8560
data:511,s:8560 
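
The loop above decompresses every result but never checks it. A minimal sketch of a round-trip check that also reports the compression ratio is shown here; the helper name `compression_ratio` and the sample payload are illustrative assumptions, not part of the original example.

```python
import zlib

def compression_ratio(raw: bytes, level: int = 6) -> float:
    """Compress raw at the given level, verify the round trip,
    and return compressed size divided by original size."""
    packed = zlib.compress(raw, level)
    # Decompressing must give back exactly the original bytes.
    if zlib.decompress(packed) != raw:
        raise ValueError("round trip failed")
    return len(packed) / len(raw)

# Hypothetical, highly repetitive payload used only for illustration.
sample = b"The quick brown fox jumps over the lazy dog. " * 200

for level in (0, 1, 6, 9):
    print(f"level {level}: ratio {compression_ratio(sample, level):.3f}")
```

Level 0 stores the data without compressing it, so its ratio ends up slightly above 1 because of the zlib header and checksum, which matches the first line of the output above.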

 

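This approach has one obvious drawback: it needs enough memory to hold both the data to be compressed and the compressed result. Can we compress the data one chunk at a time instead? Yes, we can: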

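import zlib
import this
s = this.s*10
# Write the sample text to disk so it can be read back in small chunks.
with open('a.txt','w') as t:
    t.write(s)
com = zlib.compressobj()
with open('a.txt', 'rb') as f:
    while True:
        a = f.read(64)  # read 64 bytes at a time
        if not a:
            break
        data = com.compress(a)  # may return b'' while zlib is still buffering input
        if data:
            print(f"data:{len(data)}")
        else:
            print("doing....")
    result = com.flush()  # emit whatever is still buffered
    print(f"result:{len(result)}")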
The output is:
doing....
doing....
doing....
doing....
doing....
doing....
doing....
doing....
result:515
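
The same chunk-by-chunk idea works in the other direction with `zlib.decompressobj()`. The sketch below pairs an incremental compressor with an incremental decompressor entirely in memory; the generator names `compress_chunks` and `decompress_chunks` and the sample payload are hypothetical, chosen only for this illustration.

```python
import zlib

def compress_chunks(chunks):
    """Yield compressed pieces for an iterable of bytes chunks."""
    com = zlib.compressobj()
    for chunk in chunks:
        piece = com.compress(chunk)
        if piece:  # empty while zlib is still buffering input
            yield piece
    yield com.flush()  # emit whatever is still buffered

def decompress_chunks(pieces):
    """Yield decompressed pieces for an iterable of compressed bytes."""
    dec = zlib.decompressobj()
    for piece in pieces:
        out = dec.decompress(piece)
        if out:
            yield out
    tail = dec.flush()  # any remaining output
    if tail:
        yield tail

# Hypothetical payload, split into 64-byte chunks like the file example.
original = b"Beautiful is better than ugly. " * 300
chunks = [original[i:i + 64] for i in range(0, len(original), 64)]

restored = b"".join(decompress_chunks(compress_chunks(chunks)))
print(restored == original)
```

Because both sides are generators, neither the full input nor the full output has to be held in memory by the compression step itself; only the final `join` here collects everything, and that is done just to check the result.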

 

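gzip compression
Both gzip and zlib provide compress and decompress functions, and they are used in the same way, so let's look at file operations instead.
Example of reading a compressed file: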

import gzip
with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()

 

Example of creating a gzip-compressed file:

import gzip
content = b"Lots of content here"  # a file opened with 'wb' expects bytes, not str
with gzip.open('file.txt.gz', 'wb') as f:
    f.write(content)

Example of gzip-compressing an existing file:

import gzip
import shutil
with open('file.txt', 'rb') as f_in, gzip.open('file.txt.gz', 'wb') as f_out:
  shutil.copyfileobj(f_in, f_out)
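
For data that is already in memory, the module-level functions mentioned at the start of this section avoid files altogether. A minimal sketch, assuming a made-up sample string:

```python
import gzip

# Hypothetical sample text used only for illustration.
text = "Lots of content here\n" * 100

# gzip.compress works on bytes, so encode the text first.
packed = gzip.compress(text.encode("utf-8"))

# gzip.decompress returns bytes; decode to get the text back.
restored = gzip.decompress(packed).decode("utf-8")

print(len(text), len(packed), restored == text)
```

The compressed bytes start with the gzip magic number `1f 8b`, so the result is a valid `.gz` stream and can be written to a file as is.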

bz2 compression

bz2.compress
bz2.decompress
These work essentially the same way as their zlib counterparts, so there is not much more to say.
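
For completeness, here is a short sketch of both the one-shot functions and the incremental `bz2.BZ2Compressor`, which plays the role that `compressobj` plays in zlib. The payload and the 1024-byte chunk size are arbitrary assumptions for the example.

```python
import bz2

# Hypothetical payload used only for illustration.
raw = b"Simple is better than complex. " * 300

# One-shot compression and decompression, as with zlib.
packed = bz2.compress(raw, compresslevel=9)
print(len(raw), len(packed), bz2.decompress(packed) == raw)

# Incremental compression, one chunk at a time.
comp = bz2.BZ2Compressor()
pieces = [comp.compress(raw[i:i + 1024]) for i in range(0, len(raw), 1024)]
pieces.append(comp.flush())  # emit whatever is still buffered
streamed = b"".join(pieces)
print(bz2.decompress(streamed) == raw)
```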

tarfile compression

How to extract an entire tar archive to the current working directory:

import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()

How to extract a subset of a tar archive with TarFile.extractall(), using a generator function instead of a list:

import os
import tarfile

def py_files(members):
    # Yield only the members whose name ends in ".py".
    for tarinfo in members:
        if os.path.splitext(tarinfo.name)[1] == ".py":
            yield tarinfo

tar = tarfile.open("sample.tar.gz")
tar.extractall(members=py_files(tar))
tar.close()

 

How to create an uncompressed tar archive from a list of filenames:

import tarfile
tar = tarfile.open("sample.tar", "w")
for name in ["foo", "bar", "quux"]:
    tar.add(name)
tar.close()

The same example using the with statement:

import tarfile
with tarfile.open("sample.tar", "w") as tar:
  for name in ["foo", "bar", "quux"]:
    tar.add(name)

How to read a gzip-compressed tar archive and display some member information:

import tarfile
tar = tarfile.open("sample.tar.gz", "r:gz")
for tarinfo in tar:
  # end=" " keeps the description on the same line (Python 3 print function)
  print(tarinfo.name, "is", tarinfo.size, "bytes in size and is", end=" ")
  if tarinfo.isreg():
    print("a regular file.")
  elif tarinfo.isdir():
    print("a directory.")
  else:
    print("something else.")
tar.close()

How to create an archive and reset the user information using the filter parameter of TarFile.add():

import tarfile
def reset(tarinfo):
  tarinfo.uid = tarinfo.gid = 0
  tarinfo.uname = tarinfo.gname = "root"
  return tarinfo
tar = tarfile.open("sample.tar.gz", "w:gz")
tar.add("foo", filter=reset)
tar.close()
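
The tarfile examples above assume that files such as "foo" already exist. The following self-contained sketch builds a throwaway file in a temporary directory, archives it with the `reset` filter, and reads the archive back to confirm that the ownership fields were rewritten. The file name `foo.txt` and its contents are made up for the illustration.

```python
import os
import tarfile
import tempfile

def reset(tarinfo):
    # Same filter as above: replace owner information with root.
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo

with tempfile.TemporaryDirectory() as workdir:
    # Create a small sample file to archive (hypothetical content).
    src = os.path.join(workdir, "foo.txt")
    with open(src, "w") as f:
        f.write("hello tar\n")

    # Write a gzip-compressed archive, applying the filter to each member.
    archive = os.path.join(workdir, "sample.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="foo.txt", filter=reset)

    # Read the archive back and collect the stored metadata.
    with tarfile.open(archive, "r:gz") as tar:
        members = [(m.name, m.uid, m.gid, m.uname, m.gname) for m in tar]

print(members)
```

Passing `arcname` stores the member under a short relative name instead of the full temporary path.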
Posted 2019-07-25 17:31 by 君子不徒语