Deep Learning -- Model Optimization -- Model Pruning -- 92

1. Model Compression

Goal: make the model smaller on disk and faster at inference.

Evaluation metrics:

Compression Ratio
compression ratio = total parameter count / number of nonzero parameters
i.e., parameter count of the original network / nonzero parameters in the optimized network model

Think of it as the weight before dehydration / the weight after dehydration.

Theoretical Speedup
speedup = total FLOPs / nonzero FLOPs
i.e., floating-point operations before dehydration / floating-point operations after dehydration
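
As a minimal sketch (plain NumPy; the tensors and pruning threshold are made up for illustration), the compression ratio falls straight out of the weight tensors:

import numpy as np

# Two hypothetical weight tensors after pruning
weights = [np.random.randn(64, 128), np.random.randn(128, 10)]
for w in weights:
    w[np.abs(w) < 0.5] = 0.0  # stand-in for magnitude pruning

total_params = sum(w.size for w in weights)
nonzero_params = sum(np.count_nonzero(w) for w in weights)
print("compression ratio:", total_params / nonzero_params)

# The theoretical speedup is computed the same way, with per-layer FLOPs
# in place of parameter counts.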

Methods for compressing and accelerating models along the data, model, and hardware dimensions:
1. Compress an existing network: tensor decomposition, model pruning, model quantization (targets existing models);
2. Build a new, smaller network: knowledge distillation, compact network design (targets new models).

2. Neural Network Pruning

Each of us went through neural pruning growing up: the brain discards the synaptic connections it rarely uses.

The term "pruning" is already familiar from decision-tree algorithms. As the name suggests, pruning makes the network model simpler, and reduces overfitting along the way.

Connections are removed during training.
Dense tensors become sparse (filled with zeros).
Connections can also be removed in structured blocks.

Benefits:
Less overfitting.
Sparsity advantages:
The weight files contain many zeros; with a suitable sparse-tensor representation, the model binary shrinks.
A smaller model consumes less memory bandwidth.
For models with specific sparsity patterns, optimized kernels can be developed to accelerate inference.

How pruning differs from dropout:
Pruning is not Dropout / DropConnect.
Pruning removes connections chosen by the absolute value of their weights.
Dropout randomly drops neurons during training so that every neuron gets a chance to learn.
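
The contrast in a small NumPy sketch (illustrative only): a dropout mask is re-sampled at random every training step, while a pruning mask is derived from the weight magnitudes and persists:

import numpy as np

W = np.random.randn(4, 4)

# Dropout: a fresh random mask applied to activations each training step
dropout_mask = (np.random.rand(4, 4) > 0.5).astype(W.dtype)

# Pruning: a deterministic mask on the weights, chosen by magnitude,
# that stays in place once training is done
prune_mask = (np.abs(W) > np.percentile(np.abs(W), 50)).astype(W.dtype)
W_pruned = W * prune_mask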

Pruning changes the weight tensors, not the activation tensors.
Activation tensors hold the neuron outputs actually produced during inference;
weight tensors hold the persisted connection weights between neurons. The inference procedure itself does not change.

Pruning at different granularities:

Unstructured pruning: the count changes, not the structure; setting a connection's parameter to 0 is the pruning.
Structured pruning: the structure changes, i.e., the number of output elements of a layer; for example, removing convolution kernels reduces the number of feature maps.
We can precisely delete each mutually independent parameter, or delete larger chunks at once. The finer the granularity (unstructured), the more precise the pruning, but the harder it is to accelerate inference.
Conversely, removing larger chunks at once (structured pruning) is less precise but makes the sparse computation much easier. So the pruning granularity is a trade-off between accuracy and speed; a sketch of the two follows.
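
Both granularities on a toy weight matrix (shapes and thresholds are made up; NumPy only):

import numpy as np

W = np.random.randn(4, 8)  # 4 output neurons, 8 inputs

# Unstructured: zero out individual small weights; the shape stays (4, 8)
unstructured = np.where(np.abs(W) > 0.5, W, 0.0)

# Structured: drop whole output neurons (rows); the shape shrinks
row_l1 = np.abs(W).sum(axis=1)
structured = W[row_l1 >= np.median(row_l1)]
print(unstructured.shape, structured.shape)  # (4, 8) vs. (k, 8) with k <= 4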

Pruning approaches:
One-shot Pruning
Iterative Pruning
Automatic Gradual Pruning

The workflow for compressing a model via pruning:
1. Build the model and train it to get a baseline.
2. Retrain the model with the corresponding pruning API.
3. The resulting weights will contain many zero values.
4. Compress the saved files with a compression library such as gzip or bzip2 (a helper for measuring the effect follows).
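
For step 4, a helper along the lines of the one in the TensorFlow model-optimization tutorial (the `file` argument is whichever saved model file you want to measure) makes the size reduction visible:

import os
import tempfile
import zipfile

def get_gzipped_model_size(file):
    # DEFLATE-compress the model file and return its on-disk size in bytes
    _, zipped_file = tempfile.mkstemp(".zip")
    with zipfile.ZipFile(zipped_file, "w", compression=zipfile.ZIP_DEFLATED) as f:
        f.write(file)
    return os.path.getsize(zipped_file)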

Which parameters should be pruned? Those with small magnitude, i.e., small absolute value:
Weight Magnitude Pruning -- based on the forward pass
Gradient Magnitude Pruning -- based on the backward pass

Weight magnitude pruning:
It is simple, and pruning by weight magnitude has been shown to work well. Compute the L1 norm at the chosen granularity (weights/group/kernel/filter), then, given the fraction of parameters to keep, delete the ones with the smallest values.

Gradient magnitude pruning:
The only difference is that each weight is first multiplied by its corresponding gradient, and the L1 norm is computed on that product. Both criteria are sketched below.
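
Both criteria in a few lines of NumPy (the tensors and the 80% sparsity level are illustrative; the granularity here is individual weights):

import numpy as np

w = np.random.randn(64, 64)
grad = np.random.randn(64, 64)  # gradient of the loss w.r.t. w

weight_saliency = np.abs(w)           # weight magnitude pruning
gradient_saliency = np.abs(w * grad)  # gradient magnitude pruning

# Keep the top 20% of weights by saliency, zero out the rest
threshold = np.percentile(weight_saliency, 80)
w_pruned = np.where(weight_saliency > threshold, w, 0.0)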

The pruning ratio affects accuracy: the more aggressively you prune, the larger the accuracy drop tends to be.

Which layers' parameters are easier to prune away:
Convolutional (conv) layers have far fewer parameters than fully connected (fc) layers, so the achievable compression ratio for conv parameters is smaller than for fc parameters. Put differently, conv parameters are more sensitive: pruning them hurts accuracy more.
The later conv layers, and the fully connected layers that come after them, tend to have the most easily prunable parameters.

3. Unstructured Pruning

Whether a connection is cut is decided by its importance.

Example with a weight matrix W:

numpy.percentile is a NumPy function that computes percentiles of a given array. A percentile is the value at a given percentage position within a dataset; for example, the 50th percentile is the median.

import numpy as np

# Create a sample array
data = np.array([1, 2, 3, 4, 5])

# Compute the 50th percentile (the median)
median = np.percentile(data, 50)
print(median)  # prints: 3.0

# Compute several percentiles at once
percentiles = np.percentile(data, [25, 50, 75])
print(percentiles)  # prints: [2. 3. 4.]

numpy.percentile is very handy in data analysis and statistics, e.g., for characterizing distributions, detecting outliers, and normalizing data.

What we can do is give every layer an extra variable that stores a mask matrix; the mask is derived from the fraction of parameters we want to keep (see the sketch below).
The mask matrix is re-derived as the iterations proceed; once the mask stabilizes, the actual pruning is complete. An L1 regularization term can still be added to the training loss.
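
A sketch of the per-layer mask bookkeeping (the layer name and keep ratio are made up; keep_ratio is the fraction of parameters to retain):

import numpy as np

masks = {}  # one stored mask variable per layer

def update_mask(layer_name, W, keep_ratio):
    # The |W| percentile implied by the keep ratio becomes the threshold
    threshold = np.percentile(np.abs(W), (1.0 - keep_ratio) * 100)
    masks[layer_name] = (np.abs(W) > threshold).astype(W.dtype)
    return W * masks[layer_name]  # masked weights feed the forward pass

W_fc1 = np.random.randn(256, 128)
W_fc1 = update_mask("fc1", W_fc1, keep_ratio=0.2)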

Can mistakenly pruned parameters be recovered?
Note that the mask matrix is certainly used in the forward pass and in computing the loss. But during backpropagation, should the parameters that were pruned away still be updated?
We can let them update with some probability sigma(iter); as the iteration count iter grows, that probability grows as well. The mask obtained at the next step may then differ, which helps recover connection parameters that were pruned by mistake. A hedged sketch follows.
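
A sketch of the recovery mechanism (the exact form of sigma is an assumption; the text above only says the probability depends on the iteration count):

import numpy as np

def sigma(iteration, k=1e-4):
    # Assumed schedule: recovery probability grows with the iteration count
    return 1.0 - np.exp(-k * iteration)

def sgd_step(W, grad, mask, iteration, lr=0.01):
    if np.random.rand() < sigma(iteration):
        W = W - lr * grad         # pruned entries may also move this step
    else:
        W = W - lr * grad * mask  # only surviving weights are updated
    # The mask is then re-derived, possibly reviving pruned connections
    return W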

4. Structured Pruning (Pruning Neurons)

This is very similar to unstructured pruning, and an L1 regularization term can still be added to the training loss. Taking a convolutional network as the example, whether a kernel gets cut is decided by its L1 norm: compute each kernel's sum of absolute values, sort them, and remove the kernels with the smallest sums together with their corresponding feature maps (sketched below).
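
Filter-level L1 pruning in NumPy (the kernel layout (out_channels, in_channels, kH, kW) and the number of filters dropped are assumptions):

import numpy as np

kernels = np.random.randn(64, 32, 3, 3)  # (out_channels, in_channels, kH, kW)
n_drop = 8

# L1 norm per output filter: the sum of absolute values of its weights
l1_per_filter = np.abs(kernels).reshape(kernels.shape[0], -1).sum(axis=1)

# Cut the filters with the smallest L1 norms; their feature maps go with them
keep = np.sort(np.argsort(l1_per_filter)[n_drop:])
pruned_kernels = kernels[keep]
print(pruned_kernels.shape)  # (56, 32, 3, 3)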



Reducing the kernels between Layer_i and Layer_{i+1} does not affect the tensor shape of Layer_{i+2}.

Removing those channels of the intermediate tensor B whose absence leaves the resulting output C almost unchanged indirectly prunes the kernels in front of them.

Pruning based on Batch Normalization scale factors:
Using the scaling factor γ inside BN to cut unimportant channels indirectly prunes the preceding kernels as well.

BN: y = γ · (x − μ) / sqrt(σ² + ε) + β, where μ and σ² are the batch statistics and the scale γ and shift β are learned per channel.
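
Channel selection by γ, sketched (the γ vector and prune ratio are made up; in practice γ comes from the trained BN layers, typically encouraged to be sparse with an L1 penalty during training):

import numpy as np

gamma = np.abs(np.random.randn(64))  # per-channel BN scale factors
prune_ratio = 0.3

threshold = np.percentile(gamma, prune_ratio * 100)
pruned_channels = np.where(gamma < threshold)[0]
print(f"pruning {pruned_channels.size} of {gamma.size} channels")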

sparsity:
The most important hyperparameter is usually the final model sparsity: the fraction of the final model that is zero. A sparsity of 80% means 80% of the final parameters are zero; it says how much "water" the model holds.

The begin step and end step control when pruning starts and stops during training. Setting the begin step to a value greater than 0 gives the model some time to converge before the optimization kicks in, which generally gives better results.

Pruning results:
At roughly 50%-70% sparsity, the drop in accuracy is small.
Pruning is independent of quantization tricks and usually combines well with quantization.
Different hyperparameter combinations can be explored through fine-tuning.

Code 1

import tensorflow as tf
from tensorflow import keras
import numpy as np
import tensorflow_model_optimization as tfmot
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images / 255
test_images = test_images / 255
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation="relu"),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])
model.compile(
    optimizer="adam", 
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)
model.fit(
    train_images,
    train_labels,
    epochs=4,
    validation_split=0.1
)
_, baseline_model_accuracy = model.evaluate(test_images, test_labels, verbose=0)
print("baseline_model_accuracy:", baseline_model_accuracy)

keras_file = "./models/mnist.h5"
tf.keras.models.save_model(model, keras_file, include_optimizer=False)
print("Save base model to file:", keras_file)
Epoch 1/4
1688/1688 [==============================] - 6s 3ms/step - loss: 0.3221 - accuracy: 0.9099 - val_loss: 0.1469 - val_accuracy: 0.9587
Epoch 2/4
1688/1688 [==============================] - 6s 3ms/step - loss: 0.1365 - accuracy: 0.9607 - val_loss: 0.0949 - val_accuracy: 0.9733
Epoch 3/4
1688/1688 [==============================] - 6s 3ms/step - loss: 0.0917 - accuracy: 0.9739 - val_loss: 0.0706 - val_accuracy: 0.9810
Epoch 4/4
1688/1688 [==============================] - 6s 3ms/step - loss: 0.0714 - accuracy: 0.9786 - val_loss: 0.0646 - val_accuracy: 0.9827
baseline_model_accuracy: 0.979200005531311
Save base model to file: ./models/mnist.h5

Model pruning -- here we prune the entire model

prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
batch_size = 128
epochs = 2
validation_split = 0.2
num_images = train_images.shape[0] * (1-validation_split)
end_step = np.ceil(num_images/batch_size).astype(np.int32) * epochs
pruning_param = {
    "pruning_schedule": tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.5,
        final_sparsity=0.8,
        begin_step=0,
        end_step=end_step
    )
}
model_for_pruning = prune_low_magnitude(model, **pruning_param)
# After wrapping the model with `prune_low_magnitude`, it must be recompiled
model_for_pruning.compile(optimizer="adam",loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=["accuracy"])
model_for_pruning.summary()
logdir = "logs/mnist_pruning"

callbacks = [
    tfmot.sparsity.keras.UpdatePruningStep(), 
    #tfmot.sparsity.keras.PruningSummaries(log_dir=logdir)
]

model_for_pruning.fit(
    train_images,
    train_labels, 
    batch_size=batch_size,
    epochs=epochs,
    validation_split=validation_split,
    callbacks=callbacks
)

_, model_for_pruning_accuracy = model_for_pruning.evaluate(test_images, test_labels, verbose=0)
print("model_for_pruning_accuracy:", model_for_pruning_accuracy)
print("baseline_model_accuracy:", baseline_model_accuracy)
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 prune_low_magnitude_reshape  (None, 28, 28, 1)        1         
 _1 (PruneLowMagnitude)                                          
                                                                 
 prune_low_magnitude_conv2d_  (None, 26, 26, 12)       230       
 1 (PruneLowMagnitude)                                           
                                                                 
 prune_low_magnitude_max_poo  (None, 13, 13, 12)       1         
 ling2d_1 (PruneLowMagnitude                                     
 )                                                               
                                                                 
 prune_low_magnitude_flatten  (None, 2028)             1         
 _1 (PruneLowMagnitude)                                          
                                                                 
 prune_low_magnitude_dense_1  (None, 10)               40572     
  (PruneLowMagnitude)                                            
                                                                 
=================================================================
Total params: 40,805
Trainable params: 20,410
Non-trainable params: 20,395
_________________________________________________________________
Epoch 1/2
422/422 [==============================] - 6s 8ms/step - loss: 0.1261 - accuracy: 0.9656 - val_loss: 0.1208 - val_accuracy: 0.9680
Epoch 2/2
422/422 [==============================] - 3s 8ms/step - loss: 0.1262 - accuracy: 0.9659 - val_loss: 0.1034 - val_accuracy: 0.9725
model_for_pruning_accuracy: 0.9682000279426575
baseline_model_accuracy: 0.9764000177383423
# Persist the pruned model to disk
# The call below is necessary: it removes the variables that pruning only needs during training, which would otherwise inflate the model size at inference time
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)
pruned_keras_file = "./models/jack_mnist_pruned.h5"
tf.keras.models.save_model(model_for_export, pruned_keras_file, include_optimizer=False)
print("saved pruned keras model to:", pruned_keras_file)
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
saved pruned keras model to: ./models/jack_mnist_pruned.h5

Create a compressed model with TFLite

converter = tf.lite.TFLiteConverter.from_keras_model(model_for_export)
pruned_tflite_model = converter.convert()
pruned_tflite_file = "./models/my_mnist_pruned.tflite"
with open(pruned_tflite_file, "wb") as f:
    f.write(pruned_tflite_model)
print("Saved pruned TFLite model to: ", pruned_tflite_file)

WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op while saving (showing 1 of 1). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: C:\Users\C30004~1\AppData\Local\Temp\tmp6k45wb69\assets


Saved pruned TFLite model to:  ./models/my_mnist_pruned.tflite

Create a compressed + quantized model with TFLite

converter = tf.lite.TFLiteConverter.from_keras_model(model_for_export)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_and_pruned_tflite_model = converter.convert()
quantized_and_pruned_tflite_file = "./models/jack_mnist_quantized_and_pruned.tflite"
with open(quantized_and_pruned_tflite_file, "wb") as f:
    f.write(quantized_and_pruned_tflite_model)
print("saved quantized and pruned TFLite to:", quantized_and_pruned_tflite_file)    
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op while saving (showing 1 of 1). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: C:\Users\C30004~1\AppData\Local\Temp\tmpqswuf_tm\assets


saved quantized and pruned TFLite to: ./models/jack_mnist_quantized_and_pruned.tflite
# Evaluate the model
def evaluate_model(interpreter):
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    
    predict_digits = []
    for i, test_image in enumerate(test_images):
        if i % 1000 == 0:
            print("Evaluate on {n} results so far.".format(n=i))
            
        # Preprocessing: add a batch dimension to each image and cast to float32 to match the model's input format
        test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
        interpreter.set_tensor(input_index, test_image)

        # Forward pass
        interpreter.invoke()

        output = interpreter.tensor(output_index)
        digit = np.argmax(output()[0])

        predict_digits.append(digit)
    print("\n")
    
    predict_digits = np.array(predict_digits)
    accuracy = (predict_digits==test_labels).mean()
    return accuracy
            

Load the pruned + quantized model

interpreter = tf.lite.Interpreter(model_content=quantized_and_pruned_tflite_model)
interpreter.allocate_tensors()

test_accuracy = evaluate_model(interpreter)
print("Pruned and quantized TF lite model accuracy:", test_accuracy)
print("Pruned TF accuracy:", model_for_pruning_accuracy)
Evaluate on 0 results so far.
Evaluate on 1000 results so far.
Evaluate on 2000 results so far.
Evaluate on 3000 results so far.
Evaluate on 4000 results so far.
Evaluate on 5000 results so far.
Evaluate on 6000 results so far.
Evaluate on 7000 results so far.
Evaluate on 8000 results so far.
Evaluate on 9000 results so far.


Pruned and quantized TF lite model accuracy: 0.9683
Pruned TF accuracy: 0.9682000279426575

5. Structured Pruning Code

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
import tensorflow_model_optimization as tfmot

prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0
# Prune 2 out of every 4 elements to zero (2:4 sparsity)
pruning_param_2_by_4 = {
    "sparsity_m_by_n": (2, 4),
}
# Random pruning at 0.5 sparsity
pruning_params_sparsity_0_5 = {
    "pruning_schedule": tfmot.sparsity.keras.ConstantSparsity(target_sparsity=0.5, begin_step=0, frequency=100)
}
# Define the model structure and mark which layers get structural pruning; structural pruning is applied per selected layer.
# In the example below we prune only some layers: the second convolutional layer and the first fully connected layer.
# Note that the first convolutional layer cannot be pruned structurally: structural pruning needs more than one input channel, so we apply random (unstructured) pruning to the first conv layer instead.

# The unpruned version, for reference:
# model = keras.Sequential([
#     keras.layers.Conv2D(32, 5, padding="same",activation="relu", input_shape=(28, 28, 1),name="pruning_sparsity_0_5"),
#     keras.layers.MaxPool2D((2, 2),(2, 2), padding="same"),
#     keras.layers.Conv2D(64, 5, padding="same", name='structural_pruning'),
#     keras.layers.BatchNormalization(),
#     keras.layers.ReLU(),
#     keras.layers.MaxPool2D((2,2), (2,2), padding="same"),
#     keras.layers.Flatten(),
#     keras.layers.Dense(1024, activation="relu", name='structural_pruning_dense'),
#     keras.layers.Dropout(0.4),
#     keras.layers.Dense(10)
# ])

# The pruned version:
model = keras.Sequential([
    prune_low_magnitude(
        keras.layers.Conv2D(32, 5, padding="same",activation="relu", input_shape=(28, 28, 1),name="pruning_sparsity_0_5"),
        **pruning_params_sparsity_0_5   # first layer: random-sparsity pruning
    ),
    keras.layers.MaxPool2D((2, 2),(2, 2), padding="same"),
    prune_low_magnitude(
        keras.layers.Conv2D(64, 5, padding="same", name='structural_pruning'),  # prune 2 out of every 4 to zero
        **pruning_param_2_by_4
    ),
    keras.layers.BatchNormalization(),
    keras.layers.ReLU(),
    keras.layers.MaxPool2D((2,2), (2,2), padding="same"),
    keras.layers.Flatten(),
    prune_low_magnitude(
        keras.layers.Dense(1024, activation="relu", name='structural_pruning_dense'),  # prune 2 out of every 4 to zero
        **pruning_param_2_by_4
    ),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(10)
])


model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

model.summary()


batch_size = 128
epochs = 2

model.fit(
    train_images, 
    train_labels,
    batch_size=batch_size,
    epochs=epochs,
    verbose=0,
    callbacks=tfmot.sparsity.keras.UpdatePruningStep(),
    validation_split=0.1
)

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 prune_low_magnitude_pruning  (None, 28, 28, 32)       1634      
 _sparsity_0_5 (PruneLowMagn                                     
 itude)                                                          
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 14, 14, 32)       0         
 2D)                                                             
                                                                 
 prune_low_magnitude_structu  (None, 14, 14, 64)       102466    
 ral_pruning (PruneLowMagnit                                     
 ude)                                                            
                                                                 
 batch_normalization_1 (Batc  (None, 14, 14, 64)       256       
 hNormalization)                                                 
                                                                 
 re_lu_1 (ReLU)              (None, 14, 14, 64)        0         
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 7, 7, 64)         0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 3136)              0         
                                                                 
 prune_low_magnitude_structu  (None, 1024)             6423554   
 ral_pruning_dense (PruneLow                                     
 Magnitude)                                                      
                                                                 
 dropout_1 (Dropout)         (None, 1024)              0         
                                                                 
 dense_1 (Dense)             (None, 10)                10250     
                                                                 
=================================================================
Total params: 6,538,160
Trainable params: 3,274,762
Non-trainable params: 3,263,398
_________________________________________________________________





<keras.callbacks.History at 0x21e1b56e750>
_, pruned_model_accuracy = model.evaluate(test_images, test_labels, verbose=0)
print("Pruned test accuracy:", pruned_model_accuracy)
Pruned test accuracy: 0.984499990940094
# Strip the pruning wrappers so they are not carried along when converting to the tflite format
model = tfmot.sparsity.keras.strip_pruning(model)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

tflite_file = './models/mnist_structural_pruned.tflite'
print('Saved converted pruned model to:', tflite_file)
with open(tflite_file, 'wb') as f:
    f.write(tflite_model)
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 2 of 2). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: C:\Users\C30004~1\AppData\Local\Temp\tmplsys96ze\assets


Saved converted pruned model to: ./models/mnist_structural_pruned.tflite
# Visualize and inspect the weights
interpreter = tf.lite.Interpreter(model_path=tflite_file)
interpreter.allocate_tensors()

details = interpreter.get_tensor_details()

tensor_name = "structural_pruning_dense/MatMul"
detail = [x for x in details if tensor_name in x["name"]]
print("tensor_name:", tensor_name)
print(detail)

tensor_data = interpreter.tensor(detail[0]["index"])()
print("shape of weight tensor is :", tensor_data.shape)  # (1024, 3136)
tensor_name: structural_pruning_dense/MatMul
[{'name': 'sequential_3/structural_pruning_dense/MatMul', 'index': 8, 'shape': array([1024, 3136]), 'shape_signature': array([1024, 3136]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'sequential_3/structural_pruning_dense/MatMul;sequential_3/structural_pruning_dense/Relu;sequential_3/structural_pruning_dense/BiasAdd', 'index': 15, 'shape': array([   1, 1024]), 'shape_signature': array([  -1, 1024]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
shape of weight tensor is : (1024, 3136)
width = height = 24
subset_values_to_display = tensor_data[0:height, 0:width]  # slice a small patch out of the weight matrix

val_ones = np.ones([height, width])
val_zeros = np.zeros([height, width])

subset_values_to_display = np.where(abs(subset_values_to_display) > 0, val_ones, val_zeros)
def plot_separation_lines(height, width):
    block_size = [1, 4]
    
    # Draw separating grid lines on the plot
    
    num_hlines = int((height-1)/block_size[0])
    num_vlines = int((width-1)/block_size[1])
    
    line_y_pos = [y*block_size[0] for y in range(1, num_hlines+1)]
    line_x_pos = [x*block_size[1] for x in range(1, num_vlines+1)]
    
    for y_pos in line_y_pos:
        plt.plot([-0.5, width], [y_pos-0.5, y_pos-0.5], color='w')
        
    for x_pos in line_x_pos:
        plt.plot([x_pos-0.5, x_pos-0.5], [-0.5, height], color="w")
    
plot_separation_lines(height, width)
plt.axis("off")
plt.imshow(subset_values_to_display)
plt.colorbar()
plt.title("Structural pruning for Dense layer")
plt.show()  # a strict 2-in-4 pattern
    


# Visualize the conv layer weights; the structural sparsity is applied along the last dimension
tensor_name = 'structural_pruning/Conv2D'
detail = [x for x in details if tensor_name in x["name"]]
print(detail)
tensor_data = interpreter.tensor(detail[1]["index"])()
print(f"Shape of the weight tensor is {tensor_data.shape}")
[{'name': 'sequential_3/re_lu_1/Relu;sequential_3/batch_normalization_1/FusedBatchNormV3;sequential_3/structural_pruning/BiasAdd/ReadVariableOp;sequential_3/structural_pruning/BiasAdd;sequential_3/structural_pruning/Conv2D', 'index': 2, 'shape': array([64]), 'shape_signature': array([64]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'sequential_3/structural_pruning/Conv2D', 'index': 4, 'shape': array([64,  5,  5, 32]), 'shape_signature': array([64,  5,  5, 32]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'sequential_3/re_lu_1/Relu;sequential_3/batch_normalization_1/FusedBatchNormV3;sequential_3/structural_pruning/BiasAdd/ReadVariableOp;sequential_3/structural_pruning/BiasAdd;sequential_3/structural_pruning/Conv2D1', 'index': 12, 'shape': array([ 1, 14, 14, 64]), 'shape_signature': array([-1, 14, 14, 64]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
Shape of the weight tensor is (64, 5, 5, 32)
weights_to_display = tf.reshape(tensor_data, [tf.reduce_prod(tensor_data.shape[:-1]), -1])
print(weights_to_display.shape)
weights_to_display = weights_to_display[0:width, 0:height]

val_ones = np.ones([height, width])
val_zeros = np.zeros([height, width])
subset_values_to_display = np.where(abs(weights_to_display) > 1e-9, val_ones, val_zeros)

plot_separation_lines(height, width)

plt.axis('off')
plt.imshow(subset_values_to_display)
plt.colorbar()
plt.title("Structurally pruned weights for Conv2D layer")
plt.show()
(10, 1024)



---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[97], line 7
      5 val_ones = np.ones([height, width])
      6 val_zeros = np.zeros([height, width])
----> 7 subset_values_to_display = np.where(abs(weights_to_display) > 1e-9, val_ones, val_zeros)
      9 plot_separation_lines(height, width)
     11 plt.axis('off')


File <__array_function__ internals>:180, in where(*args, **kwargs)


ValueError: operands could not be broadcast together with shapes (10,24) (24,24) (24,24) 
# Fetch the weights of the randomly pruned conv layer
tensor_name = 'pruning_sparsity_0_5/Conv2D'
detail = [x for x in details if tensor_name in x["name"]]
print(detail)
tensor_data = interpreter.tensor(detail[1]["index"])()
print(f"Shape of the weight tensor is {tensor_data.shape}")

weights_to_display = tf.reshape(tensor_data, [tensor_data.shape[0], tf.reduce_prod(tensor_data.shape[1:])])
weights_to_display = weights_to_display[0:width, 0:height]

val_ones = np.ones([height, width])
val_zeros = np.zeros([height, width])
subset_values_to_display = np.where(abs(weights_to_display) > 0, val_ones, val_zeros)

plot_separation_lines(height, width)

plt.axis('off')
plt.imshow(subset_values_to_display)
plt.colorbar()
plt.title("Unstructured pruned weights for Conv2D layer")
plt.show()
[{'name': 'sequential_3/pruning_sparsity_0_5/Conv2D', 'index': 3, 'shape': array([32,  5,  5,  1]), 'shape_signature': array([32,  5,  5,  1]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'sequential_3/pruning_sparsity_0_5/Relu;sequential_3/pruning_sparsity_0_5/BiasAdd;sequential_3/pruning_sparsity_0_5/Conv2D;sequential_3/pruning_sparsity_0_5/BiasAdd/ReadVariableOp', 'index': 10, 'shape': array([ 1, 28, 28, 32]), 'shape_signature': array([-1, 28, 28, 32]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
