Using the zlib module in Python

What the zlib module does:
  Compresses data for storage on disk, in memory, or on other devices

1. Compressing and decompressing data in memory

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib
import binascii

original_data = b'This is the original text.'
print('Original data: length : {}, content : {}'.format(len(original_data), original_data))

# Compress the data
compressed_data = zlib.compress(original_data)
# binascii.hexlify renders the bytes as hexadecimal for display
print('Compressed data: length : {}, content : {}'.format(len(compressed_data), binascii.hexlify(compressed_data)))

# Decompress the data
decompressed_data = zlib.decompress(compressed_data)
print('Decompressed data: length : {}, content : {}'.format(len(decompressed_data), decompressed_data))
zlib_memory.py

Output

[root@ mnt]# python3 zlib_memory.py 
Original data: length : 26, content : b'This is the original text.'
Compressed data: length : 32, content : b'789c0bc9c82c5600a2928c5485fca2ccf4ccbcc41c8592d48a123d007f2f097e' # compressing a small input does not necessarily reduce its size
Decompressed data: length : 26, content : b'This is the original text.'
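
Part of the overhead seen above comes from the zlib container itself: the deflate data is wrapped in a 2-byte header and a 4-byte Adler-32 trailer. The following is a minimal sketch that inspects both, assuming the standard zlib framing (RFC 1950); the variable names are only illustrative.

```python
import zlib

data = b'This is the original text.'
compressed = zlib.compress(data)

# Header: the CMF and FLG bytes. Read as a 16-bit big-endian number,
# they must be a multiple of 31.
cmf, flg = compressed[0], compressed[1]
header_ok = (cmf * 256 + flg) % 31 == 0

# Trailer: Adler-32 of the *uncompressed* data, stored big-endian.
trailer = int.from_bytes(compressed[-4:], 'big')
trailer_ok = trailer == zlib.adler32(data)

print('header  :', compressed[:2].hex(), header_ok)
print('trailer :', compressed[-4:].hex(), trailer_ok)
```

For an input this short there is too little repetition for deflate to exploit, so the fixed framing bytes are enough to make the result larger than the input.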

2. Finding the size at which compression starts to pay off

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib

original_data = b'This is the original text.'

template = '{:>15}  {:>15}'
print(template.format('Original size', 'Compressed size'))
print(template.format('-' * 15, '-' * 15))

for i in range(5):
    data = original_data * i  # repeat the data i times
    compressed = zlib.compress(data)  # compress it
    # Conditional expression: mark rows where compression made the data larger
    highlight = '*' if len(data) < len(compressed) else ''
    print(template.format(len(data), len(compressed)), highlight)
zlib_lengths.py

Output

[root@ mnt]# python3 zlib_lengths.py 
  Original size  Compressed size
---------------  ---------------
              0                8 *
             26               32 *
             52               35  # from this row on, compression pays off
             78               35 
            104               36 
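
One way to act on this table is to compress a payload only when doing so actually shrinks it, and to record that decision in a marker byte. The sketch below is one possible scheme, not part of zlib: the `RAW` and `COMPRESSED` marker values and the `pack`/`unpack` helpers are hypothetical names invented for illustration.

```python
import zlib

RAW = b'\x00'         # hypothetical marker: payload stored as-is
COMPRESSED = b'\x01'  # hypothetical marker: payload is a zlib stream


def pack(data: bytes) -> bytes:
    """Compress data only when that makes it smaller."""
    candidate = zlib.compress(data)
    if len(candidate) < len(data):
        return COMPRESSED + candidate
    return RAW + data


def unpack(blob: bytes) -> bytes:
    """Reverse pack(), using the marker byte to pick the path."""
    marker, payload = blob[:1], blob[1:]
    if marker == COMPRESSED:
        return zlib.decompress(payload)
    if marker == RAW:
        return payload
    raise ValueError('unknown marker byte: {!r}'.format(marker))


short = b'This is the original text.'
long_ = short * 4

packed_short = pack(short)
packed_long = pack(long_)
print(len(short), '->', len(packed_short))
print(len(long_), '->', len(packed_long))
```

The cost is one extra byte per payload, in exchange for never storing a result that is larger than the input plus that byte.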

3. Compressing with different compression levels

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib

original_data = b'This is the original text.' * 1024

template = '{:>15}  {:>15}'
print(template.format('Level', 'Size'))
print(template.format('-' * 15, '-' * 15))

for i in range(0, 10):
    data = zlib.compress(original_data, i)  # compress at level i
    print(template.format(i, len(data)))
zlib_compresslevel.py

Output

[root@python-mysql mnt]# python3 zlib_compresslevel.py 
          Level             Size
---------------  ---------------
              0            26635
              1              215
              2              215
              3              215
              4              118
              5              118 <== recommended
              6              118 <== recommended (6 is the zlib default)
              7              118
              8              118
              9              118
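
As a small complement to the table, the sketch below checks two things: that omitting the level gives the same result as level 6, and that level 0 stores the data with a little framing overhead rather than shrinking it. It reuses the same input as the script above; the variable names are illustrative.

```python
import zlib

data = b'This is the original text.' * 1024

default = zlib.compress(data)      # level omitted
explicit = zlib.compress(data, 6)  # level 6 requested explicitly
stored = zlib.compress(data, 0)    # level 0: no compression, framing only

same_as_six = default == explicit
print('default == level 6 :', same_as_six)
print('level 0 overhead   :', len(stored) - len(data), 'bytes')

# Whatever the level, decompression needs no matching parameter.
round_trips = all(
    zlib.decompress(zlib.compress(data, level)) == data
    for level in range(10)
)
print('all levels round-trip:', round_trips)
```

Higher levels spend more CPU time searching for matches, so the right choice depends on how the size savings compare with that cost for the data at hand.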

4. Incremental compression and decompression with zlib

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib
import binascii

compressor = zlib.compressobj(1)
chunks = []  # every piece of compressed output, in order

with open('content.txt', 'rb') as infile:
    while True:
        block = infile.read(64)  # read 64 bytes at a time
        if not block:
            break
        compressed = compressor.compress(block)
        if compressed:
            chunks.append(compressed)
            print('Compressed: {}'.format(
                binascii.hexlify(compressed)))
        else:
            print('Buffering...')
    remaining = compressor.flush()  # flush whatever is still buffered
    chunks.append(remaining)
    print('Flushed: {}'.format(binascii.hexlify(remaining)))

# Decompress in one call. The first compress() call already returned the
# 2-byte zlib header (7801 at level 1), so every chunk must be kept and joined;
# the output of flush() alone does not include the header.
decompress_data = zlib.decompress(b''.join(chunks))
print(decompress_data)
zlib_incremental.py

Output

[root@ mnt]# python3 zlib_incremental.py 
Compressed: b'7801'
Buffering...
Buffering...
Buffering...
Buffering...
Buffering...
Flushed: b'55904b6ac4400c44f73e451da0f129b20c2110c85e696b8c40ddedd167ce1f7915025a087daa9ef4be8c07e4f21c38962e834b800647435fd3b90747b2810eb9c4bbcc13ac123bded6e4bef1c91ee40d3c6580e3ff52aad2e8cb2eb6062dad74a89ca904cbb0f2545e0db4b1f2e01955b8c511cb2ac08967d228af1447c8ec72e40c4c714116e60cdef171bb6c0feaa255dff1c507c2c4439ec9605b7e0ba9fc54bae39355cb89fd6ebe5841d673c7b7bc68a46f575a312eebd220d4b32441bdc1b36ebf0aedef3d57ea4b26dd986dd39af57dfb05d32279de'

# Decompressed data
b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec\negestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo, a\nelementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla\nfacilisi. Sed tristique eros eu libero. Pellentesque vel arcu. Vivamus\npurus orci, iaculis ac, suscipit sit amet, pulvinar eu,\nlacus.\n'
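
The script above decompresses everything in one call. A decompression object can also consume the stream piece by piece, which avoids holding the whole compressed input at once. The following is a minimal sketch that uses an in-memory stand-in text instead of content.txt; the 64-byte and 16-byte block sizes are arbitrary choices.

```python
import zlib

text = b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. ' * 20

# Compress in 64-byte blocks, keeping every chunk that comes back.
compressor = zlib.compressobj(1)
chunks = []
for start in range(0, len(text), 64):
    chunk = compressor.compress(text[start:start + 64])
    if chunk:
        chunks.append(chunk)
chunks.append(compressor.flush())
stream = b''.join(chunks)

# Decompress the stream 16 bytes at a time.
decompressor = zlib.decompressobj()
pieces = []
for start in range(0, len(stream), 16):
    pieces.append(decompressor.decompress(stream[start:start + 16]))
pieces.append(decompressor.flush())
restored = b''.join(pieces)

print('round trip ok :', restored == text)
print('stream ended  :', decompressor.eof)
```

The `eof` attribute turns true once the end of the compressed stream has been seen, which is a convenient check that no chunk went missing.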

5. Decompressing a mix of compressed and uncompressed data

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib

with open('zlib_mixed.py', 'rb') as f:
    lorem = f.read()
compressed = zlib.compress(lorem)

# Concatenate the compressed data with the uncompressed data
combined = compressed + lorem

# Create a decompression object
decompressor = zlib.decompressobj()
decompressed = decompressor.decompress(combined)  # only the compressed part is decompressed

decompressed_matches = decompressed == lorem
print('Decompressed data matches:', decompressed_matches)

# Bytes that follow the end of the compressed stream are kept in unused_data
unused_matches = decompressor.unused_data == lorem
print('Unused data matches      :', unused_matches)
zlib_mixed.py

Output

[root@ mnt]# python3 zlib_mixed.py 
Decompressed data matches: True
Unused data matches      : True
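
The same `unused_data` attribute can be used to walk through several zlib streams stored back to back. The sketch below assumes the blob contains nothing but complete zlib streams; `split_streams` is a hypothetical helper name.

```python
import zlib


def split_streams(blob: bytes) -> list:
    """Decompress back-to-back zlib streams found in blob."""
    results = []
    while blob:
        decompressor = zlib.decompressobj()
        results.append(decompressor.decompress(blob))
        # Whatever follows the end of this stream is left untouched here.
        blob = decompressor.unused_data
    return results


messages = [b'first message', b'second message', b'third']
blob = b''.join(zlib.compress(m) for m in messages)
decoded = split_streams(blob)
print(decoded)
```

Each pass uses a fresh decompression object, because one object handles exactly one stream.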

6. Verifying data integrity with the CRC32 and Adler-32 algorithms

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib

data = open('test.py', 'rb').read()

cksum = zlib.adler32(data)
print('Adler32: {:12d}'.format(cksum))
print('       : {:12d}'.format(zlib.adler32(data, cksum)))

cksum = zlib.crc32(data)
print('CRC-32 : {:12d}'.format(cksum))
print('       : {:12d}'.format(zlib.crc32(data, cksum)))
zlib_checksums.py

Output

[root@ mnt]# python3 zlib_checksums.py 
Adler32:   4272063592
       :    539822302
CRC-32 :   2072120480
       :   1894987964
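
The second line of each pair in the output comes from passing the first checksum back in as a starting value, which yields the checksum of the data followed by the same data again. That running-checksum form is what makes it possible to checksum a large input block by block. Here is a minimal sketch with in-memory data standing in for test.py:

```python
import zlib

data = b'This is the original text.' * 100

# Checksum of the whole buffer in one call.
adler_whole = zlib.adler32(data)
crc_whole = zlib.crc32(data)

# Same checksums, built up 64 bytes at a time.
adler_running = zlib.adler32(b'')  # starting value for Adler-32
crc_running = zlib.crc32(b'')      # starting value for CRC-32
for start in range(0, len(data), 64):
    block = data[start:start + 64]
    adler_running = zlib.adler32(block, adler_running)
    crc_running = zlib.crc32(block, crc_running)

print('Adler32 match:', adler_running == adler_whole)
print('CRC-32 match :', crc_running == crc_whole)
```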

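7. Compressing and decompressing data sent over the network with zlib (at the end, the client reads the file locally and checks that it equals what the server sent)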

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec
egestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo, a
elementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla
facilisi. Sed tristique eros eu libero. Pellentesque vel arcu. Vivamus
purus orci, iaculis ac, suscipit sit amet, pulvinar eu,
lacus.
content.txt
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import socket
import logging
from io import BytesIO
import binascii
import zlib

# Number of bytes to read per block
BLOCK_SIZE = 64

if __name__ == '__main__':
    logging.basicConfig(
        level=logging.DEBUG,
        format='%(name)s : %(message)s'
    )

    logger = logging.getLogger('Client')

    ip_port = ('127.0.0.1', 8000)
    logging.info('Connecting to server: {}'.format(ip_port[0] + ':' + str(ip_port[1])))
    # Create the socket object
    sk = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)

    # Connect to the server
    sk.connect(ip_port)

    # Name of the file the server should read
    request_file = 'content.txt'
    logging.debug('Sending file name: {}'.format(request_file))
    sk.sendall(request_file.encode('utf-8'))

    # Buffer for the data received from the server
    buffer = BytesIO()

    # Create a decompression object
    decompressor = zlib.decompressobj()

    while True:
        response = sk.recv(BLOCK_SIZE)
        if not response:
            break
        logger.debug('Read from server: {}'.format(binascii.hexlify(response)))
        to_decompress = decompressor.unconsumed_tail + response
        while to_decompress:
            decompressed = decompressor.decompress(to_decompress)
            if decompressed:
                logger.debug('Decompressed: {}'.format(decompressed))
                buffer.write(decompressed)
                to_decompress = decompressor.unconsumed_tail
            else:
                logger.debug('Buffering...')
                to_decompress = None

    remainder = decompressor.flush()
    if remainder:
        logger.debug('Flushed: {}'.format(remainder))
        buffer.write(remainder)

    # Collect all of the decompressed data
    full_response = buffer.getvalue()
    with open(request_file, 'rb') as f:
        read_file = f.read()
    logger.debug('File from server equals the local file: {}'.format(full_response == read_file))
    sk.close()
zlib_client.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import zlib
import socketserver
import logging
import binascii

# Number of bytes to read per block
BLOCK_SIZE = 64

class ZlibRequestHandler(socketserver.BaseRequestHandler):
    logger = logging.getLogger('Server')

    def handle(self):
        # Create a compression object
        compressor = zlib.compressobj(1)
        # Receive the file name sent by the client
        filename = self.request.recv(1024)

        self.logger.debug('Received file name from client: {}'.format(filename))

        with open(filename, 'rb') as rf:
            while True:
                block = rf.read(BLOCK_SIZE)
                if not block:
                    break
                self.logger.debug('Read from file: {}'.format(block))

                # Compress the block
                compressed = compressor.compress(block)
                if compressed:
                    self.logger.debug('Sent (hex): {}'.format(binascii.hexlify(compressed)))
                    self.request.sendall(compressed)
                else:
                    self.logger.debug('Buffering...')

            # Fetch whatever is left in the compressor's buffer
            remaining = compressor.flush()

            while remaining:  # send the flushed data block by block until none is left
                to_send = remaining[:BLOCK_SIZE]
                remaining = remaining[BLOCK_SIZE:]
                self.logger.debug('Flushed: {}'.format(binascii.hexlify(to_send)))
                self.request.sendall(to_send)
            return

if __name__ == '__main__':
    logging.basicConfig(
        level=logging.DEBUG,
        format='%(name)s : %(message)s'
    )
    ip_port = ('127.0.0.1', 8000)
    socketserver.TCPServer.allow_reuse_address = True
    server = socketserver.TCPServer(ip_port, ZlibRequestHandler)
    server.serve_forever()
zlib_server.py

Output

[root@ mnt]# python3 zlib_server.py 
Server : Received file name from client: b'content.txt'
Server : Read from file: b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec\n'
Server : Sent (hex): b'7801'
Server : Read from file: b'egestas, enim et consectetuer ullamcorper, lectus ligula rutrum '
Server : Buffering...
Server : Read from file: b'leo, a\nelementum elit tortor eu quam. Duis tincidunt nisi ut ant'
Server : Buffering...
Server : Read from file: b'e. Nulla\nfacilisi. Sed tristique eros eu libero. Pellentesque ve'
Server : Buffering...
Server : Read from file: b'l arcu. Vivamus\npurus orci, iaculis ac, suscipit sit amet, pulvi'
Server : Buffering...
Server : Read from file: b'nar eu,\nlacus.\n\n'
Server : Buffering...
Server : Flushed: b'55904b4a05410c45e7b58abb80a257e15044109cc7eaf808a4aada7cdefa4d8f44c820e473ef495eb7f1845c9e13e7d66d7009d0e4e8187b398fe04836d02997'
Server : Flushed: b'f890f500abc48197bd78347eb00779072f99e0f8bf94aa34c7b68bad434b2b1d2a8f54826558792aef0e6aac3c7945156e71c4b60a70e2276996578a23640d39'
Server : Flushed: b'730596b8200b73051f78bb5dda370dd1aa1ff8e01361e2213fc960db7e0ba97c557ae09d55cb89fd6e3e594136f2c0a73c69a6b72bad18b70de9101a5992a0d1'
Server : Flushed: b'e159b75f85f6f79e2bf5298b6eccdeb466fd68ed174d1979e8'
[root@p mnt]# python zlib_client.py 
root : Connecting to server: 127.0.0.1:8000
root : Sending file name: content.txt
Client : Read from server: 780155904b4a05410c45e7b58abb80a257e15044109cc7eaf808a4aada7cdefa4d8f44c820e473ef495eb7f1845c9e13e7d66d7009d0e4e8187b398fe04836d0
Client : Decompressed: Lorem ipsum dolor sit amet, consectetuer a
Client : Read from server: 2997f890f500abc48197bd78347eb00779072f99e0f8bf94aa34c7b68bad434b2b1d2a8f54826558792aef0e6aac3c7945156e71c4b60a70e2276996578a2364
Client : Decompressed: dipiscing elit. Donec
egestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo, a
elementum elit tortor eu quam. Duis ti
Client : Read from server: 0d39730596b8200b73051f78bb5dda370dd1aa1ff8e01361e2213fc960db7e0ba97c557ae09d55cb89fd6e3e594136f2c0a73c69a6b72bad18b70de9101a5992
Client : Decompressed: ncidunt nisi ut ante. Nulla
facilisi. Sed tristique eros eu libero. Pellentesque vel arcu. Vivamus
purus orci, iaculis
Client : Read from server: a0d1e159b75f85f6f79e2bf5298b6eccdeb466fd68ed174d1979e8
Client : Decompressed:  ac, suscipit sit amet, pulvinar eu,
lacus.


Client : File from server equals the local file: True
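
In the client above, `decompress()` is called without a `max_length` argument, so `unconsumed_tail` stays empty and the inner loop runs at most once per received block. The attribute only holds data when the output size is capped. The sketch below shows that bounded variant with the network replaced by an in-memory list of chunks; `MAX_OUTPUT` and `decompress_bounded` are hypothetical names, and the cap of 32 bytes is an arbitrary assumption.

```python
import zlib

BLOCK_SIZE = 64  # bytes "received" per step, as in the client above
MAX_OUTPUT = 32  # assumed cap on bytes produced per decompress() call


def decompress_bounded(chunks):
    """Yield decompressed pieces; decompress() output is capped at MAX_OUTPUT.

    The final flush() is not capped, so the last piece may be longer.
    """
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        pending = chunk
        while pending:
            piece = decompressor.decompress(pending, MAX_OUTPUT)
            if piece:
                yield piece
            # Input that did not fit under the cap waits here.
            pending = decompressor.unconsumed_tail
    tail = decompressor.flush()
    if tail:
        yield tail


text = b'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. ' * 10
stream = zlib.compress(text, 1)
chunks = [stream[i:i + BLOCK_SIZE] for i in range(0, len(stream), BLOCK_SIZE)]

pieces = list(decompress_bounded(chunks))
restored = b''.join(pieces)
print('pieces   :', len(pieces))
print('restored :', restored == text)
```

Capping the output this way keeps memory use predictable when a small compressed input could expand into a very large result.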
posted @ 2019-12-25 14:23  小粉优化大师