
Machine Learning: TensorFlow Thread Queues and IO Operations

TensorFlow Queues

When reading training samples, we want control over the order in which they are read.

tf.FIFOQueue: a first-in, first-out queue; elements are dequeued in the order they were enqueued

tf.RandomShuffleQueue: a queue that dequeues its elements in random order

tf.FIFOQueue

FIFOQueue(capacity, dtypes, name='fifo_queue') creates a queue that dequeues elements in first-in, first-out order

  • capacity: an integer; the upper bound on the number of elements that may be stored in the queue

  • dtypes: a list of DType objects; the length of dtypes must equal the number of tensors in each queue element. The dtypes (and shapes) determine the shape of the elements enqueued later

Methods

  • dequeue(name=None): dequeue one element

  • enqueue(vals, name=None): enqueue one element

  • enqueue_many(vals, name=None): returns an op that enqueues the elements in the list or tuple vals

  • size(name=None): returns a scalar tensor whose value is the (integer) number of elements in the queue

Example (a synchronous dequeue, +1, enqueue pipeline)

Note the behavior of the enqueue ops:

import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'


def fifoqueue():
    # create a queue and specify its element type
    queue = tf.FIFOQueue(3, tf.float32)
    # add elements to the queue
    en_many = queue.enqueue_many([[0.1, 0.2, 0.3], ])
    # define a dequeue op
    deq_op = queue.dequeue()
    # add 1 to the dequeued element
    # the + operator is overloaded: 1 is converted to a tensor and tf.add is called
    incre_op = deq_op + 1
    # re-enqueue the incremented element
    enq_op = queue.enqueue(incre_op)
    # the ops must be run inside a session
    # everything below runs in the main thread, i.e. synchronously
    with tf.Session() as sess:
        # run the enqueue_many op to add (0.1, 0.2, 0.3)
        sess.run(en_many)

        # process the values
        for i in range(3):
            sess.run(enq_op)

        # take the data out of the queue and hand it to the model for training
        for i in range(queue.size().eval()):
            ret = sess.run(deq_op)
            print(ret)


if __name__ == '__main__':
    fifoqueue()
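
As an aside, the example above only exercises tf.FIFOQueue. Here is a minimal hedged sketch of tf.RandomShuffleQueue for comparison (the capacity and values are illustrative); note that min_after_dequeue is a required argument that controls how many elements must remain queued so the shuffle stays random:

import tensorflow as tf


def random_shuffle_demo():
    # capacity 10; at least 2 elements must stay in the queue after each dequeue
    queue = tf.RandomShuffleQueue(capacity=10, min_after_dequeue=2, dtypes=tf.float32)
    en_many = queue.enqueue_many([[0.1, 0.2, 0.3, 0.4, 0.5], ])
    deq_op = queue.dequeue()
    with tf.Session() as sess:
        sess.run(en_many)
        # unlike FIFOQueue, the dequeue order here is random
        for _ in range(3):
            print(sess.run(deq_op))


if __name__ == '__main__':
    random_shuffle_demo()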

Analysis: when the data volume is large, the enqueue operations read data from disk into memory, and the main thread has to wait for the enqueue to finish before training can proceed. A session can run multiple threads, which makes asynchronous reading possible.

Queue runner

tf.train.QueueRunner(queue, enqueue_ops=None) creates a QueueRunner

  • queue: a Queue

  • enqueue_ops: a list of enqueue ops, one per thread; e.g. [op] * 2 specifies two threads

create_threads(sess, coord=None, start=False) creates threads that run the enqueue ops in the given session

  • start: boolean; if True the threads start immediately; if False the caller must call start() to launch them

  • coord: a thread coordinator, needed for thread management later

Asynchronous operation

  • Use a queue runner to increment a variable and enqueue it while the main thread dequeues, and observe the effect

Analysis:

  • The problem now is that the enqueue threads keep running on their own; once the dequeues we need are done, the program cannot exit. We need a mechanism to synchronize the threads and terminate the others.

Thread coordinator

tf.train.Coordinator()

  • A thread coordinator implements a simple mechanism to coordinate the termination of a group of threads
  • request_stop(): request that the threads stop

  • should_stop(): check whether a stop has been requested (rarely used)

  • join(threads=None, stop_grace_period_secs=120): wait for the threads to terminate

return: a coordinator instance

import tensorflow as tf


def async_operation():
    """
    Increment a variable and enqueue it using a queue runner and a thread coordinator
    :return: None
    """
    # define a queue with capacity 1000 and element type tf.float32
    queue = tf.FIFOQueue(1000, tf.float32)
    # build an increment op and an enqueue op
    var = tf.Variable(0.0)
    # assign_add and enq_op are not synchronized; assign_add may run many times
    # before enq_op runs once
    incre_op = tf.assign_add(var, tf.constant(1.0))
    # enqueue
    enq_op = queue.enqueue(incre_op)
    # dequeue
    deq_op = queue.dequeue()

    # define the queue runner
    qr = tf.train.QueueRunner(queue=queue, enqueue_ops=[enq_op] * 2)

    init_op = tf.global_variables_initializer()
    # a session created via `with` is closed automatically once the main thread finishes
    # do the child threads stop automatically? no - they do not exit, they just hang
    with tf.Session() as sess:
    # sess = tf.Session()
        sess.run(init_op)
        # create the thread coordinator
        coord = tf.train.Coordinator()
        # create the enqueue threads through the queue runner
        # start=True means the threads start immediately: the enqueue ops are
        # already running, in two threads
        threads = qr.create_threads(sess=sess, coord=coord, start=True)
        # the enqueues run in other threads
        for i in range(1000):
            # the main thread dequeues
            ret = sess.run(deq_op)
            print(ret)
        # the main thread is done,
        # so request that the child threads stop
        coord.request_stop()
        # coord.should_stop()
        # and join them
        coord.join(threads=threads)

    return None


if __name__ == '__main__':
    async_operation()

TensorFlow file reading

File reading workflow

1. File reading API - building the file name queue

  • tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True) feeds output strings (e.g. file names) into a pipeline queue

  • string_tensor: a rank-1 tensor containing the file names

  • num_epochs: how many passes to make over the data; unlimited by default

  • return: a queue that outputs the strings (a usage sketch follows)
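
A brief hedged sketch of the num_epochs behavior (the file names here are assumed): when num_epochs is set, the epoch counter lives in a local variable, so local variables must be initialized before the queue runners start, and the queue raises an out-of-range error once the data has been passed over num_epochs times.

import tensorflow as tf

# 'A.csv' and 'B.csv' are assumed file names for illustration
file_queue = tf.train.string_input_producer(['A.csv', 'B.csv'], num_epochs=2, shuffle=True)

with tf.Session() as sess:
    # the epoch counter is a local variable, so this init is required
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # ... read from file_queue here; after two passes the queue raises
    # tf.errors.OutOfRangeError
    coord.request_stop()
    coord.join(threads)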

2. File reading API - file readers

Choose the reader that matches the file format:

class tf.TextLineReader

  • reads text files such as comma-separated values (CSV); reads line by line by default

  • return: a reader instance

tf.FixedLengthRecordReader(record_bytes)

  • for binary files in which every record is a fixed number of bytes

  • record_bytes: an integer, the number of bytes to read per record

  • return: a reader instance

tf.TFRecordReader

  • reads TFRecords files

All readers share a common read method:

  • read(file_queue): reads one record from the queue

  • returns a tuple of tensors (key is the file name, value is the content: a line, a byte record, or an image)

3. File reading API - content decoders

Since what is read from a file is a string, we need functions that parse those strings into tensors.

tf.decode_csv(records, record_defaults=None, field_delim=None, name=None)

  • converts CSV records into tensors; used together with tf.TextLineReader

  • records: a string tensor; each string is one record (line) of the CSV

  • record_defaults: specifies the type of each column after splitting. For example, if splitting yields three columns, this argument could be [[1], [], ['string']]; leaving a type unspecified (an empty []) is also allowed. If there are many columns, say 100, you can write [[]] * 100

  • field_delim: the field delimiter, "," by default (see the sketch below)
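
As a quick illustration of typed record_defaults, a minimal hedged sketch (the record content is made up):

import tensorflow as tf

line = tf.constant('1,2.5,hello')  # one CSV record; the values are illustrative
# the defaults pin down the column types: int32, float32, string
col1, col2, col3 = tf.decode_csv(line, record_defaults=[[0], [0.0], ['']])
with tf.Session() as sess:
    print(sess.run([col1, col2, col3]))  # [1, 2.5, b'hello']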

tf.decode_raw(bytes, out_type, little_endian=None, name=None)

  • converts bytes into a vector of numbers; the bytes are a string-typed tensor. Used together with tf.FixedLengthRecordReader to read string-encoded binary data back as uint8 (see the sketch below)
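
A minimal hedged sketch of tf.decode_raw on a hand-made byte string:

import tensorflow as tf

raw = tf.constant(b'\x01\x02\x03')  # a string tensor holding three raw bytes
nums = tf.decode_raw(raw, tf.uint8)  # reinterpret the bytes as a uint8 vector
with tf.Session() as sess:
    print(sess.run(nums))  # [1 2 3]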

Starting the threads

tf.train.start_queue_runners(sess=None, coord=None) collects all the queue threads in the graph and starts them

  • sess: the session to run them in

  • coord: a thread coordinator

  • return: the list of started threads

What if there are multiple files to read, or multiple samples per read - how do we read them through the pipeline?

Batching at the read end of the pipeline

tf.train.batch(tensors, batch_size, num_threads=1, capacity=32, name=None) reads the specified number of tensors

  • tensors: can be a list containing tensors

  • batch_size: the batch size to read from the queue

  • num_threads: the number of threads feeding the queue

  • capacity: an integer, the maximum number of elements in the queue

  • return: tensors

tf.train.shuffle_batch(tensors, batch_size, capacity, min_after_dequeue, num_threads=1, ...)

  • reads the specified number of tensors in shuffled order

  • min_after_dequeue: the number of tensors to keep in the queue, which keeps the shuffling random (see the sketch below)
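
There is no shuffle_batch example below, so here is a minimal hedged sketch; the stand-in constant tensors replace the decoded CSV columns of the example that follows:

import tensorflow as tf

# stand-in scalar tensors for decoded CSV columns (illustrative)
col1 = tf.constant('Sea1')
col2 = tf.constant('C1')
col1_batch, col2_batch = tf.train.shuffle_batch(
    [col1, col2], batch_size=4, capacity=32,
    min_after_dequeue=10,  # keep at least 10 queued elements so shuffling stays random
    num_threads=1)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(sess.run([col1_batch, col2_batch]))
    coord.request_stop()
    coord.join(threads)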

File reading example

import tensorflow as tf
import os


def csv_reader():
    # list all files under ./csvdata/
    file_names = os.listdir('./csvdata/')
    file_names = [os.path.join('./csvdata/', file_name) for file_name in file_names]
    # file_names = ["./data/csvdata/" + file_name  for file_name in file_names]

    print(file_names)
    # build the file queue from the file names
    file_queue = tf.train.string_input_producer(file_names)

    # create the reader; it reads line by line
    reader = tf.TextLineReader()

    # call read on the reader: reader.read(file_queue)
    # the result is a (key, value) pair; value is one line of some file
    key, value = reader.read(file_queue)
    print(key, value)
    # decode the value
    col1, col2 = tf.decode_csv(value, record_defaults=[['null'], ['null']], field_delim=',')

    # set up the pipeline batch
    col1_batch, col2_batch = tf.train.batch(tensors=[col1, col2], batch_size=100, num_threads=2, capacity=10)

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        ret = sess.run([col1_batch, col2_batch])
        print(ret)
        # once the main thread is done, request that the child threads stop
        coord.request_stop()
        coord.join(threads)


if __name__ == '__main__':
    csv_reader()
Output:
"C:\Program Files\Python36\python.exe" D:/数据分析/机器学习/day5/3-代码/day5_test.py
['./csvdata/A.csv', './csvdata/B.csv', './csvdata/C.csv']
Tensor("ReaderReadV2:0", shape=(), dtype=string) Tensor("ReaderReadV2:1", shape=(), dtype=string)
(TensorFlow CPU feature-guard warnings omitted)
[array([b'Sea1', b'Sea2', b'Sea3', b'Alpha1', b'Alpha2', b'Alpha3',
       b'Bee1', b'Bee2', b'Bee3', b'Sea1', b'Sea2', b'Sea3', b'Alpha1',
       b'Alpha2', b'Alpha3', b'Bee1', b'Bee2', b'Bee3', b'Alpha1',
       b'Alpha2', b'Alpha3', b'Sea1', b'Sea2', b'Sea3', b'Bee1', b'Bee2',
       b'Bee3', b'Sea1', b'Sea2', b'Sea3', b'Bee1', b'Bee2', b'Bee3',
       b'Alpha1', b'Alpha2', b'Alpha3', b'Sea1', b'Sea2', b'Sea3',
       b'Bee1', b'Bee2', b'Bee3', b'Alpha1', b'Alpha2', b'Alpha3',
       b'Sea1', b'Sea2', b'Sea3', b'Alpha1', b'Alpha2', b'Alpha3',
       b'Bee1', b'Bee2', b'Bee3', b'Bee1', b'Bee2', b'Bee3', b'Sea1',
       b'Sea2', b'Sea3', b'Alpha1', b'Alpha2', b'Alpha3', b'Sea1',
       b'Sea2', b'Sea3', b'Alpha1', b'Alpha2', b'Alpha3', b'Bee1',
       b'Bee2', b'Bee3', b'Alpha1', b'Alpha2', b'Alpha3', b'Sea1',
       b'Sea2', b'Sea3', b'Bee1', b'Bee2', b'Bee3', b'Sea1', b'Sea2',
       b'Sea3', b'Bee1', b'Bee2', b'Bee3', b'Alpha1', b'Alpha2',
       b'Alpha3', b'Bee1', b'Bee2', b'Bee3', b'Sea1', b'Sea2', b'Sea3',
       b'Alpha1', b'Alpha2', b'Alpha3', b'Bee1'], dtype=object), array([b'C1', b'C2', b'C3', b'A1', b'A2', b'A3', b'B1', b'B2', b'B3',
       b'C1', b'C2', b'C3', b'A1', b'A2', b'A3', b'B1', b'B2', b'B3',
       b'A1', b'A2', b'A3', b'C1', b'C2', b'C3', b'B1', b'B2', b'B3',
       b'C1', b'C2', b'C3', b'B1', b'B2', b'B3', b'A1', b'A2', b'A3',
       b'C1', b'C2', b'C3', b'B1', b'B2', b'B3', b'A1', b'A2', b'A3',
       b'C1', b'C2', b'C3', b'A1', b'A2', b'A3', b'B1', b'B2', b'B3',
       b'B1', b'B2', b'B3', b'C1', b'C2', b'C3', b'A1', b'A2', b'A3',
       b'C1', b'C2', b'C3', b'A1', b'A2', b'A3', b'B1', b'B2', b'B3',
       b'A1', b'A2', b'A3', b'C1', b'C2', b'C3', b'B1', b'B2', b'B3',
       b'C1', b'C2', b'C3', b'B1', b'B2', b'B3', b'A1', b'A2', b'A3',
       b'B1', b'B2', b'B3', b'C1', b'C2', b'C3', b'A1', b'A2', b'A3',
       b'B1'], dtype=object)]

Process finished with exit code 0

TensorFlow image reading

The three elements of a digitized image

  • The three elements: height, width, and number of channels

The relationship between the three elements and a tensor

Basic image operations

Goals:

  • 1. Make the image data more uniform

  • 2. Convert all images to a specified size

  • 3. Reduce the amount of image data to avoid extra overhead

Operation:

  • 1. Resize the images

Basic image operation API

  • tf.image.resize_images(images, size) resizes images

  • images: image data as a 4-D tensor of shape [batch, height, width, channels] or a 3-D tensor of shape [height, width, channels]

  • size: a 1-D int32 tensor: new_height, new_width, the new size of the image; returns the images in 4-D or 3-D format

Image reading API

Image reader

  • tf.WholeFileReader: a reader that outputs the entire content of a file as the value

  • return: a reader instance

  • read(file_queue): the output is a file name (key) and the content of that file (value)

Image decoders

tf.image.decode_jpeg(contents)

  • decodes a JPEG-encoded image into a uint8 tensor

  • return: a uint8 tensor, 3-D with shape [height, width, channels]

tf.image.decode_png(contents)

  • decodes a PNG-encoded image into a uint8 or uint16 tensor

  • return: a tensor, 3-D with shape [height, width, channels] (see the sketch below)
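
The example below only decodes JPEG, so as an aside, here is a minimal hedged sketch of decode_png ('example.png' is an assumed file name):

import tensorflow as tf

# 'example.png' is assumed to exist, for illustration only
png_bytes = tf.read_file('example.png')
image = tf.image.decode_png(png_bytes, channels=3)  # uint8, shape [height, width, 3]
resized = tf.image.resize_images(image, size=[200, 200])  # float32 after resizing
with tf.Session() as sess:
    print(sess.run(resized).shape)  # (200, 200, 3)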

Image batching workflow

  • 1. Build the image file queue

  • 2. Build the image reader

  • 3. Read the image data

  • 4. Process the image data

import tensorflow as tf
import os


def pic_reader():
    file_names = os.listdir('./dog/')
    file_names = [os.path.join('./dog/', file_name) for file_name in file_names]
    # build the file queue
    file_queue = tf.train.string_input_producer(file_names)
    # create the reader
    reader = tf.WholeFileReader()
    # key is the file name, value is the raw image data
    key, value = reader.read(file_queue)
    # decode the value to get the image tensor
    image = tf.image.decode_jpeg(value)
    # before batching, all images must be resized to the same shape:
    # [height, width, ?] --> [200, 200, ?]
    resize_image = tf.image.resize_images(image, size=[200, 200])
    # after resizing the shape is [200, 200, None]: the channel count is still
    # unknown, so pin it down with set_shape
    resize_image.set_shape([200, 200, 3])
    print(resize_image)
    # batching requires the number of channels to be known
    image_batch = tf.train.batch(tensors=[resize_image], batch_size=100, num_threads=2, capacity=100)
    print(image_batch)
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess, coord=coord)
        ret = sess.run(image_batch)
        print(ret)

        coord.request_stop()
        coord.join(threads)


if __name__ == '__main__':
    pic_reader()
Output:
"C:\Program Files\Python36\python.exe" D:/数据分析/机器学习/day5/3-代码/day5_test.py
Tensor("Squeeze:0", shape=(200, 200, 3), dtype=float32)
Tensor("batch:0", shape=(100, 200, 200, 3), dtype=float32)
(TensorFlow CPU feature-guard warnings omitted)
[[[[ 33.          47.          86.        ]
   [ 36.725       50.725       88.235     ]
   [ 40.45        54.45        91.45      ]
   ...
   [  6.2350006    3.2350006    0.        ]
   [  6.           3.           0.        ]
   [  6.           3.           0.        ]]

   ... (pixel values of the remaining images omitted; the batch holds 100 images of shape [200, 200, 3]) ...]]]]

Process finished with exit code 0

A small example

import tensorflow as tf
import os


class Cifar(object):
    """
    Demo of reading binary files; store the data from the binary files
    into TFRecords, then read the TFRecords data back
    """
    def __init__(self):
        self.height = 32
        self.width = 32
        self.channels = 3  # colour images
        self.label_bytes = 1
        self.image_bytes = self.height * self.width * self.channels
        # number of bytes to read per record
        self.bytes = self.label_bytes + self.image_bytes


    def read_and_decode(self, file_names):
        """
        Read and decode the binary images
        :return: the batched images and labels
        """
        # build the file queue from the list of file names
        file_queue = tf.train.string_input_producer(file_names)
        # create the reader and specify the record size, self.bytes
        reader = tf.FixedLengthRecordReader(self.bytes)
        # read the binary data; uint8, one byte each, 3073 = 1 + 3072
        key, value = reader.read(file_queue)
        # decode the binary data
        label_image = tf.decode_raw(value, tf.uint8)
        # before decoding the image, split the 3073 bytes into 1 + 3072
        # slicing extracts the first of the 3073 bytes (the label)
        # the start position and length are given as rank-1 tensors
        # split self.bytes into self.label_bytes and self.image_bytes
        label = tf.cast(tf.slice(label_image, [0], [self.label_bytes]), tf.int32)

        image = tf.cast(tf.slice(label_image, [self.label_bytes], [self.image_bytes]), tf.int32)
        # the image is 3072 bytes; reshape them into [32, 32, 3] --> the image tensor
        reshape_image = tf.reshape(image, shape=[32, 32, 3])
        # since all the images already have the same shape, no resize is needed
        # and we can batch directly
        image_batch, label_batch = tf.train.batch([reshape_image, label], batch_size=100, num_threads=2, capacity=100)

        return image_batch, label_batch


    def save_to_tfrecords(self):
        """
        Store the image data that was read as a TFRecords file
        (stub; implemented in the next example)
        :return:
        """


        return None


    def read_from_tfrecords(self):
        """
        Read the data back from a TFRecords file
        (stub; implemented in the next example)
        :return:
        """

        return None


def call_cifar():
    # list the file names under a path
    file_names = os.listdir('./cifar-10-batches-bin/')
    file_names = [os.path.join('./cifar-10-batches-bin/', file_name) for file_name in file_names if file_name[-3:] == 'bin']
    # create the object and call its method
    cifar = Cifar()
    batch_image, batch_label = cifar.read_and_decode(file_names)
    print("===========")
    print(batch_image, batch_label)
    # run the graph we just built
    with tf.Session() as sess:
        # start the child threads
        coord = tf.train.Coordinator()  # thread coordinator
        threads = tf.train.start_queue_runners(sess, coord=coord)
        ret = sess.run([batch_image, batch_label])
        print(ret)
        coord.request_stop()
        coord.join(threads)


if __name__ == '__main__':
    call_cifar()

Image storage vs. computation types

Storage: uint8 (saves space)

Matrix computation: float32 (better precision)
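
A one-line hedged sketch of the convention (the /255 scaling is a common practice, not something the text prescribes):

import tensorflow as tf

image_uint8 = tf.zeros([32, 32, 3], dtype=tf.uint8)  # stand-in for a decoded image
# cast from the compact storage type to float32 before doing math
image_float = tf.cast(image_uint8, tf.float32) / 255.0  # scale to [0, 1]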

TFRecords analysis, writing and reading

  • TFRecords is a binary file format built into TensorFlow

  • It makes better use of memory and is easier to copy and move; the format exists so that the binary data and the labels (the training class labels) can be stored in the same file

Writing TFRecords

1. Create the TFRecords writer

tf.python_io.TFRecordWriter(path) writes a tfrecords file

  • path: the path of the TFRecords file

  • return: a writer instance

Methods

  • write(record): writes one string record to the file

  • close(): closes the writer

Note: the string is a serialized Example, obtained via Example.SerializeToString()

2. Build an Example protocol block for each sample

tf.train.Example(features=None) writes to a tfrecords file

  • features: a feature instance of type tf.train.Features

  • return: an Example protocol block

tf.train.Features(feature=None)

Builds the key-value information for each sample. feature: a dict whose keys are the names to save under,

  • and whose values are tf.train.Feature instances

  • return: a Features instance

  • tf.train.Feature(**options)

**options: for example

  • bytes_list=tf.train.BytesList(value=[Bytes])

  • int64_list=tf.train.Int64List(value=[Value])

  • tf.train.Int64List(value=[Value])

  • tf.train.BytesList(value=[Bytes])

  • tf.train.FloatList(value=[value])

A combined sketch follows.
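
Putting the writer and the Example protocol block together, a minimal hedged sketch (the feature values and the 'demo.tfrecords' path are illustrative):

import tensorflow as tf

# one sample: raw image bytes plus an integer label (values are made up)
example = tf.train.Example(features=tf.train.Features(feature={
    "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'\x01\x02\x03'])),
    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[7]))
}))

writer = tf.python_io.TFRecordWriter('demo.tfrecords')  # assumed output path
writer.write(example.SerializeToString())  # write the serialized Example
writer.close()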

Reading TFRecords

Same flow as the file readers above, with an extra parsing step in the middle.

Parsing the Example protocol blocks of a TFRecords file

tf.parse_single_example(serialized, features=None, name=None) parses a single Example proto

  • serialized: a scalar string tensor, one serialized Example

  • features: a dict whose keys are the names to read and whose values are FixedLenFeature instances

  • return: a dict of key-value pairs, keyed by the names that were read

  • tf.FixedLenFeature(shape, dtype)

  • shape: the shape of the input data; usually left unspecified as an empty list

  • dtype: the type of the input data, which must match the type stored in the file; only float32, int64 and string are allowed (see the sketch below)
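
A minimal hedged sketch that pairs with the writer sketch above; here the serialized Example is built inline, whereas in a real pipeline it would come from tf.TFRecordReader().read(file_queue):

import tensorflow as tf

example = tf.train.Example(features=tf.train.Features(feature={
    "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'\x01\x02\x03'])),
    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[7]))
}))
serialized = tf.constant(example.SerializeToString())

feature = tf.parse_single_example(serialized, features={
    "image": tf.FixedLenFeature([], tf.string),
    "label": tf.FixedLenFeature([], tf.int64)
})
image = tf.decode_raw(feature["image"], tf.uint8)  # back to a uint8 vector
label = tf.cast(feature["label"], tf.int32)        # int64 -> int32
with tf.Session() as sess:
    print(sess.run([image, label]))  # [array([1, 2, 3], dtype=uint8), 7]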

Workflow for writing the CIFAR-10 batches to tfrecords

  • 1. Build the writer

  • 2. Build an Example for each sample

  • 3. Write the serialized Examples

Workflow for reading tfrecords

  • 1. Build the TFRecords reader

  • 2. Parse the Examples

  • 3. Convert the format: decode the bytes

"""
读取二进制文件转换成张量,写进TFRecords,同时读取TFRcords
"""
import tensorflow as tf
 
# 命令行参数
FLAGS = tf.app.flags.FLAGS  # 获取值
tf.app.flags.DEFINE_string("tfrecord_dir", "cifar10.tfrecords", "写入图片数据文件的文件名")
 
# 读取二进制转换文件
class CifarRead(object):
    """
    读取二进制文件转换成张量,写进TFRecords,同时读取TFRcords
    """
 
    def __init__(self, file_list):
        """
        初始化图片参数
        :param file_list:图片的路径名称列表
        """
 
        # 文件列表
        self.file_list = file_list
 
        # 图片大小,二进制文件字节数
        self.height = 32
        self.width = 32
        self.channel = 3
        self.label_bytes = 1
        self.image_bytes = self.height * self.width * self.channel
        self.bytes = self.label_bytes + self.image_bytes
 
    def read_and_decode(self):
        """
        解析二进制文件到张量
        :return: 批处理的image,label张量
        """
        # 1.构造文件队列
        file_queue = tf.train.string_input_producer(self.file_list)
 
        # 2.阅读器读取内容
        reader = tf.FixedLengthRecordReader(self.bytes)
 
        key, value = reader.read(file_queue)  # key为文件名,value为元组
 
        print(value)
 
        # 3.进行解码,处理格式
        label_image = tf.decode_raw(value, tf.uint8)
        print(label_image)
 
        # 处理格式,image,label
        # 进行切片处理,标签值
        # tf.cast()函数是转换数据格式,此处是将label二进制数据转换成int32格式
        label = tf.cast(tf.slice(label_image, [0], [self.label_bytes]), tf.int32)
 
        # 处理图片数据
        image = tf.slice(label_image, [self.label_bytes], [self.image_bytes])
        print(image)
 
        # 处理图片的形状,提供给批处理
        # 因为image的形状已经固定,此处形状用动态形状来改变
        image_tensor = tf.reshape(image, [self.height, self.width, self.channel])
        print(image_tensor)
 
        # 批处理图片数据
        image_batch, label_batch = tf.train.batch([image_tensor, label], batch_size=10, num_threads=1, capacity=10)
 
        return image_batch, label_batch
 
    def write_to_tfrecords(self, image_batch, label_batch):
        """
        将文件写入到TFRecords文件中
        :param image_batch:
        :param label_batch:
        :return:
        """
 
        # 建立TFRecords文件存储器
        writer = tf.python_io.TFRecordWriter('cifar10.tfrecords')  # 传进去命令行参数
 
        # 循环取出每个样本的值,构造example协议块
        for i in range(10):
            # 取出图片的值,  #写进去的是值,而不是tensor类型,
            # 写入example需要bytes文件格式,将tensor转化为bytes用tostring()来转化
            image = image_batch[i].eval().tostring()
 
            # 取出标签值,写入example中需要使用int形式,所以需要强制转换int
            label = int(label_batch[i].eval()[0])
 
            # 构造每个样本的example协议块
            example = tf.train.Example(features=tf.train.Features(feature={
                "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image])),
                "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label]))
            }))
 
            # 写进去序列化后的值
            writer.write(example.SerializeToString())  # 此处其实是将其压缩成一个二进制数据
 
        writer.close()
 
        return None
 
    def read_from_tfrecords(self):
        """
        从TFRecords文件当中读取图片数据(解析example)
        :param self:
        :return: image_batch,label_batch
        """
 
        # 1.构造文件队列
        file_queue = tf.train.string_input_producer(['cifar10.tfrecords'])  # 参数为文件名列表
 
        # 2.构造阅读器
        reader = tf.TFRecordReader()
 
        key, value = reader.read(file_queue)
 
        # 3.解析协议块,返回的值是字典
        feature = tf.parse_single_example(value, features={
            "image": tf.FixedLenFeature([], tf.string),
            "label": tf.FixedLenFeature([], tf.int64)
        })
 
        # feature["image"],feature["label"]
        # 处理标签数据    ,cast()只能在int和float之间进行转换
        label = tf.cast(feature["label"], tf.int32)  # 将数据类型int64 转换为int32
 
        # 处理图片数据,由于是一个string,要进行解码,  #将字节转换为数字向量表示,字节为一字符串类型的张量
        # 如果之前用了tostring(),那么必须要用decode_raw()转换为最初的int类型
        # decode_raw()可以将数据从string,bytes转换为int,float类型的
        image = tf.decode_raw(feature["image"], tf.uint8)
 
        # 转换图片的形状,此处需要用动态形状进行转换
        image_tensor = tf.reshape(image, [self.height, self.width, self.channel])
 
        # 4.批处理
        image_batch, label_batch = tf.train.batch([image_tensor, label], batch_size=10, num_threads=1, capacity=10)
 
        return image_batch, label_batch
 
 
if __name__ == '__main__':
    # 找到文件路径,名字,构造路径+文件名的列表,"A.csv"...
    # os.listdir() 方法用于返回指定的文件夹包含的文件或文件夹的名字的列表
    import os
    file_names = os.listdir('./cifar-10-batches-bin/')
    file_list = [os.path.join('./cifar-10-batches-bin/', file_name) for file_name in file_names if file_name[-3:] == 'bin']
 
    # 初始化参数
    cr = CifarRead(file_list)
 
    # 读取二进制文件
    # image_batch, label_batch = cr.read_and_decode()
 
    # 从已经存储的TFRecords文件中解析出原始数据
    image_batch, label_batch = cr.read_from_tfrecords()
 
    with tf.Session() as sess:
        # 线程协调器
        coord = tf.train.Coordinator()
 
        # 开启线程
        threads = tf.train.start_queue_runners(sess, coord=coord)
 
        print(sess.run([image_batch, label_batch]))
 
        print("存进TFRecords文件")
        cr.write_to_tfrecords(image_batch,label_batch)
        print("存进文件完毕")
 
        # 回收线程
        coord.request_stop()
        coord.join(threads)

The output is as follows:

"C:\Program Files\Python36\python.exe" D:/数据分析/机器学习/day5/3-代码/tet.py
(TensorFlow CPU feature-guard warnings omitted)
[array([[[[115, 118, 121],
         [122, 124, 126],
         [129, 133, 136],
         ...,
         [156, 155, 155],
         [153, 147, 144],
         [143, 140, 139]],

        ... (pixel values of the remaining images omitted; the batch holds 10 images of shape [32, 32, 3]) ...]]]]], dtype=uint8), array([0, 7, 4, 2, 5, 3, 0, 4, 1, 3])]
Writing to TFRecords file