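Simple Image Processing with TensorFlow

Image Encoding and Decoding

An RGB color image with three channels can be viewed as a three-dimensional matrix in which the number at each position is a pixel value. However, an image file does not store these numbers directly; it stores a compressed encoding of them. Restoring an image file to its three-dimensional matrix is called decoding, and the reverse process is encoding.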
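The encode/decode round trip can be sketched without any image library. The example below is only an illustration under stated assumptions: it uses zlib (the DEFLATE compression that PNG builds on) as a stand-in for a real image codec, and the helper names encode and decode are hypothetical, not TensorFlow functions.

```python
import zlib

# A tiny 2x3 RGB image as a height x width x channels nested list (values 0-255).
height, width, channels = 2, 3, 3
image = [
    [[255, 0, 0], [255, 0, 0], [255, 0, 0]],
    [[0, 0, 255], [0, 0, 255], [0, 0, 255]],
]


def encode(img):
    """Flatten the pixel values to bytes and compress them (stand-in for a codec)."""
    raw = bytes(v for row in img for pixel in row for v in pixel)
    return zlib.compress(raw)


def decode(data, h, w, c):
    """Decompress the bytes and rebuild the h x w x c nested list."""
    raw = zlib.decompress(data)
    return [
        [list(raw[(y * w + x) * c:(y * w + x + 1) * c]) for x in range(w)]
        for y in range(h)
    ]


encoded = encode(image)                              # opaque compressed bytes
decoded = decode(encoded, height, width, channels)   # back to the 3-D matrix
print(decoded == image)
```

Unlike this lossless stand-in, JPEG is a lossy format, so a decoded JPEG generally only approximates the pixel values that were encoded.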

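TensorFlow provides encoding and decoding functions for common image formats such as JPEG and PNG. The following code, written against the TensorFlow 1.x API (tf.Session, tf.gfile), demonstrates them for a JPEG image.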

import tensorflow as tf
import matplotlib.pyplot as plt


# Read the raw (still encoded) bytes of the JPEG file
image_raw_data = tf.gfile.GFile('img/1.jpg', 'rb').read()

with tf.Session() as sess:
    # Decode the JPEG bytes into a tensor of shape [height, width, channels]
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())

    plt.imshow(img_data.eval())
    plt.show()

    # img_data = tf.image.convert_image_dtype(img_data, dtype=tf.float32)

    # Re-encode the tensor as JPEG and write it to a new file
    encoded_image = tf.image.encode_jpeg(img_data)
    with tf.gfile.GFile("img/2.jpg", "wb") as f:
        f.write(encoded_image.eval())

Output:

[[[ 63 54 37]
  [ 72 63 46]
  [ 77 68 51]
  ...
  [ 68 85 53]
  [ 72 87 66]
  [ 74 91 75]]

 [[ 73 64 47]
  [ 80 71 54]
  [ 85 76 59]
  ...
  [ 71 88 56]
  [ 75 90 69]
  [ 76 93 77]]] (output truncated for length)

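Here, tf.image.decode_jpeg() decodes a JPEG (.jpg) image and tf.image.encode_jpeg() performs the corresponding encoding; the encoded bytes are then written to the given path under a new file name. A similar function, tf.image.encode_png(), encodes an image as PNG.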

Resizing Images

In TensorFlow, tf.image.resize_images() adjusts the size of an image. Its signature is:

def resize_images(images,
                  size,
                  method=ResizeMethod.BILINEAR,
                  align_corners=False):

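First parameter, images: the input image(s), either a 4-D tensor of shape [batch, height, width, channels] or a 3-D tensor of shape [height, width, channels].

Second parameter, size: a 1-D int32 tensor with two elements (new_height, new_width) giving the new image size, for example [300, 300].

Third parameter, method: optional; selects the interpolation algorithm:

method=0: bilinear interpolation;

method=1: nearest-neighbor interpolation;

method=2: bicubic interpolation;

method=3: area interpolation.

Fourth parameter, align_corners: a boolean. If True, the centers of the four corner pixels of the input and output tensors are aligned, preserving the values at the corner pixels. Defaults to False.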

import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

image_raw_data = tf.gfile.GFile('img/1.jpg', 'rb').read()   # load the raw image bytes

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    plt.imshow(img_data.eval())
    plt.show()

    # Arguments: the source image, the target size, and the interpolation method
    resized = tf.image.resize_images(img_data, [300, 300], method=0)
    # The resized tensor is float32; cast back to uint8 so imshow displays it correctly
    resized = np.asarray(resized.eval(), dtype='uint8')
    plt.imshow(resized)
    plt.show()

Output: (figure omitted)

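There are many similar functions as well, for example:

resize_area(...): resizes images to size using area interpolation;

resize_bicubic(...): resizes images to size using bicubic interpolation;

resize_bilinear(...): resizes images to size using bilinear interpolation;

resize_image_with_crop_or_pad(...): crops or pads an image to the target width and height;

resize_nearest_neighbor(...): resizes images using nearest-neighbor interpolation.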
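To make the difference between two of these interpolation methods concrete, here is a minimal pure-Python sketch on a single-channel image. It is not TensorFlow's implementation: the functions resize_nearest and resize_bilinear are hypothetical, and the coordinate mapping (source = destination * old_size / new_size) is one simple convention assumed for illustration; TensorFlow's exact mapping, and the effect of align_corners, may differ.

```python
def resize_nearest(img, new_h, new_w):
    """Nearest neighbor: copy the source pixel each output pixel maps onto."""
    h, w = len(img), len(img[0])
    return [
        [img[min(h - 1, y * h // new_h)][min(w - 1, x * w // new_w)]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def resize_bilinear(img, new_h, new_w):
    """Bilinear: blend the four surrounding source pixels by distance."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        sy = y * h / new_h                     # fractional source row
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = x * w / new_w                 # fractional source column
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


gray = [[0, 100],
        [100, 200]]

print(resize_nearest(gray, 4, 4))   # blocky: each source pixel is repeated
print(resize_bilinear(gray, 4, 4))  # smooth: intermediate values appear
```

Nearest neighbor only ever reproduces values already present in the source, while bilinear interpolation introduces intermediate values, which is why its result has a floating-point type.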

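Flipping Images

TensorFlow provides several functions for flipping images. The main ones are:

tf.image.flip_left_right(...): flips an image horizontally (left to right);

tf.image.flip_up_down(...): flips an image vertically (upside down);

tf.image.random_flip_left_right(...): randomly flips an image horizontally (left to right);

tf.image.random_flip_up_down(...): randomly flips an image vertically (upside down).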

# coding:utf-8

import matplotlib.pyplot as plt
import tensorflow as tf
# Read the raw image data
img = tf.gfile.GFile('img/1.jpg', 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(img)

    # Flip vertically (top to bottom)
    flipped0 = tf.image.flip_up_down(img_data)

    # Flip horizontally (left to right)
    flipped1 = tf.image.flip_left_right(img_data)

    # Transpose: swap the height and width axes
    flipped2 = tf.image.transpose_image(img_data)

    plt.subplot(221), plt.imshow(img_data.eval()), plt.title('original')
    plt.subplot(222), plt.imshow(flipped0.eval()), plt.title('flip_up_down')
    plt.subplot(223), plt.imshow(flipped1.eval()), plt.title('flip_left_right')
    plt.subplot(224), plt.imshow(flipped2.eval()), plt.title('transpose_image')
    plt.tight_layout()

    plt.show()

Output: (figure omitted)
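What flipping and transposing do to the pixel grid can also be sketched in plain Python on a tiny single-channel image. The three helper functions below are hypothetical illustrations that mirror the names used in this section; they are not the TensorFlow functions themselves.

```python
# A 2x3 single-channel "image": rows run top to bottom, columns left to right.
image = [
    [1, 2, 3],
    [4, 5, 6],
]


def flip_up_down(img):
    """Reverse the order of the rows."""
    return img[::-1]


def flip_left_right(img):
    """Reverse the pixels within each row."""
    return [row[::-1] for row in img]


def transpose(img):
    """Swap the height and width axes."""
    return [list(col) for col in zip(*img)]


print(flip_up_down(image))     # [[4, 5, 6], [1, 2, 3]]
print(flip_left_right(image))  # [[3, 2, 1], [6, 5, 4]]
print(transpose(image))        # [[1, 4], [2, 5], [3, 6]]
```

Note that transposing changes the shape (2x3 becomes 3x2), whereas the two flips keep it.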

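Adjusting Image Color

In many image recognition applications, changes to an image's contrast, brightness, saturation, and hue do not affect the recognition result. When training a neural network, these properties of the training images can therefore be adjusted at random so that the trained model is affected as little as possible by such irrelevant factors. TensorFlow provides many APIs for adjusting these properties. The main functions are:

adjust_brightness(...): adjusts the brightness of RGB or grayscale images;

adjust_contrast(...): adjusts the contrast of RGB or grayscale images;

adjust_gamma(...): performs gamma correction on the input image;

adjust_hue(...): adjusts the hue of RGB images;

adjust_saturation(...): adjusts the saturation of RGB images;

tf.image.per_image_standardization(...): standardizes an image.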

# coding:utf-8

import matplotlib.pyplot as plt
import tensorflow as tf
# Read the raw image data
img = tf.gfile.GFile('img/1.jpg', 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(img)
    
    # The second argument of each call can be changed to vary the effect
    adjusted0 = tf.image.adjust_brightness(img_data, -0.5)
    adjusted1 = tf.image.adjust_contrast(img_data, -5)
    adjusted2 = tf.image.adjust_saturation(img_data, 0.1)
    adjusted3 = tf.image.adjust_saturation(img_data, -5)
    adjusted4 = tf.image.per_image_standardization(img_data)

    plt.subplot(231), plt.imshow(img_data.eval()), plt.title('original')
    plt.subplot(232), plt.imshow(adjusted0.eval()), plt.title('adjust_brightness')
    plt.subplot(233), plt.imshow(adjusted1.eval()), plt.title('adjust_contrast')
    plt.subplot(234), plt.imshow(adjusted2.eval()), plt.title('adjust_saturation (0.1)')
    plt.subplot(235), plt.imshow(adjusted3.eval()), plt.title('adjust_saturation (-5)')
    plt.subplot(236), plt.imshow(adjusted4.eval()), plt.title('per_image_standardization')
    plt.tight_layout()

    plt.show()

Output: (figure omitted)
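The arithmetic behind three of these adjustments can be sketched on a single channel stored as a flat list of floats. This is a simplified illustration, not TensorFlow's code: it assumes brightness adds a constant, contrast scales each value's distance from the channel mean, and standardization subtracts the mean and divides by max(stddev, 1/sqrt(N)). The function names are hypothetical, and details such as dtype conversion and clipping are left out.

```python
import math


def adjust_brightness(pixels, delta):
    """Add the same offset to every pixel value."""
    return [p + delta for p in pixels]


def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from the channel mean by `factor`."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * factor + mean for p in pixels]


def standardize(pixels):
    """Shift to zero mean and scale by the standard deviation.

    The divisor is floored at 1/sqrt(N) so that a (nearly) constant
    image does not cause a division by zero.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    adjusted_std = max(std, 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_std for p in pixels]


channel = [0.2, 0.4, 0.6, 0.8]           # float pixel values in [0, 1]
print(adjust_brightness(channel, -0.1))  # every value shifted down
print(adjust_contrast(channel, 2.0))     # spread around the mean doubled

raw = [50.0, 100.0, 150.0, 200.0]        # pixel values on a 0-255 scale
print(standardize(raw))                  # zero mean, unit variance
```

A contrast factor between 0 and 1 pulls values toward the mean, a factor above 1 pushes them apart, and a negative factor such as the -5 used in the code above mirrors them around the mean.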

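Working with Bounding Boxes

In many image recognition datasets, the objects of interest in an image are marked with bounding boxes. TensorFlow provides some tools for working with them. Two functions are introduced first:

tf.image.draw_bounding_boxes(): draws bounding boxes at the given coordinates;

tf.image.sample_distorted_bounding_box(): randomly samples a bounding box, which can then be used to crop the image.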

# coding:utf-8

import matplotlib.pyplot as plt
import tensorflow as tf
# Read the raw image data
img = tf.gfile.GFile('img/1.jpg', 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(img)
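    # draw_bounding_boxes expects a float batch of shape [batch, height, width, channels]
    expand = tf.expand_dims(tf.image.convert_image_dtype(img_data, dtype=tf.float32), 0)

    # Each box is [y_min, x_min, y_max, x_max], normalized to [0, 1]
    boxes = tf.constant([[[0.07, 0.45, 0.3, 0.6]]])

    result = tf.image.draw_bounding_boxes(images=expand, boxes=boxes)

    # A second set with two boxes, also used to constrain the random crop below
    boxes2 = tf.constant([[[0.07, 0.45, 0.3, 0.6], [0.5, 0.2, 0.7, 0.4]]])

    batched2 = tf.expand_dims(tf.image.convert_image_dtype(img_data, dtype=tf.float32), 0)

    img_withbox = tf.image.draw_bounding_boxes(images=batched2, boxes=boxes2)

    # Sample a random crop window that covers at least 10% of one of the boxes
    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
        tf.shape(img_data), bounding_boxes=boxes2, min_object_covered=0.1
    )

    # Crop the original image to the sampled window
    distorted_image = tf.slice(img_data, begin=begin, size=size)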

    plt.subplot(131), plt.imshow(img_data.eval()), plt.title('original')
    # Drop the batch dimension by indexing, rather than hard-coding a 500x500 image size
    plt.subplot(132), plt.imshow(result.eval()[0]), plt.title('draw_bounding_box')
    plt.subplot(133), plt.imshow(distorted_image.eval()), plt.title('sample_distorted_bounding_box')

    plt.tight_layout()

    plt.show()

Output: (figure omitted)

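Note: tf.image.draw_bounding_boxes() takes a four-dimensional tensor holding a batch of images, so the decoded image needs an extra leading dimension, which is added here with tf.expand_dims().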
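The box format and the cropping step can be illustrated in plain Python. The sketch below assumes boxes are given as [y_min, x_min, y_max, x_max], normalized to [0, 1], as in the code above. The helpers box_to_pixels, add_batch_dim, and crop are hypothetical, and the rounding used here is an assumption that may not match how TensorFlow maps normalized coordinates to pixels.

```python
def box_to_pixels(box, height, width):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (round(y_min * height), round(x_min * width),
            round(y_max * height), round(x_max * width))


def add_batch_dim(image):
    """Wrap one image in a list, giving a batch of size 1 (the role of tf.expand_dims)."""
    return [image]


def crop(image, begin, size):
    """Cut out `size` = (h, w) pixels starting at `begin` = (y, x), like a 2-D slice."""
    y, x = begin
    h, w = size
    return [row[x:x + w] for row in image[y:y + h]]


# A 4x4 single-channel image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

y0, x0, y1, x1 = box_to_pixels([0.25, 0.5, 0.75, 1.0], height=4, width=4)
window = crop(image, begin=(y0, x0), size=(y1 - y0, x1 - x0))

print((y0, x0, y1, x1))           # (1, 2, 3, 4)
print(window)                     # [[6, 7], [10, 11]]
print(len(add_batch_dim(image)))  # 1
```

Because the coordinates are normalized, the same box description applies to an image of any resolution, which is convenient when images are resized before or after the boxes are used.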

Reference: https://www.w3cschool.cn/tensorflow_python/tensorflow_python-m7ku2pvd.html

Posted 2019-03-06 15:36 by 鲁太师
