Linked Questions

Popular Questions

Altough tensorflow recommends very much to not use deprecated functions that are going to be replaced by tf.data objects, there seems to be no good documentation for cleanly replacing the deprecated for the modern approach. Moreover, Tensorflow tutorials still use the deprecated functionality to treat file processing (Reading data tutorial: https://www.tensorflow.org/api_guides/python/reading_data).

On the other hand, though there is good documentation for using the 'modern' approach (Importing data tutorial: https://www.tensorflow.org/guide/datasets), there still exists the old tutorials which will probably lead many, as me, to use the deprecated one first. That is why one would like to cleanly translate the deprecated to the 'modern' approach, and an example for this translation would probably be very useful.

#!/usr/bin/env python3
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import shutil
import os

if not os.path.exists('example'):
    shutil.rmTree('example');
    os.mkdir('example');

batch_sz = 10; epochs = 2; buffer_size = 30; samples = 0;
for i in range(50):
    _x = np.random.randint(0, 256, (10, 10, 3), np.uint8);
    plt.imsave("example/image_{}.jpg".format(i), _x)
images = tf.train.match_filenames_once('example/*.jpg')
fname_q = tf.train.string_input_producer(images,epochs, True);
reader = tf.WholeFileReader()
_, value = reader.read(fname_q)
img = tf.image.decode_image(value)
img_batch = tf.train.batch([img], batch_sz, shapes=([10, 10, 3]));
with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(),
        tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for _ in range(epochs):
        try:
            while not coord.should_stop():
                sess.run(img_batch)
                samples += batch_sz;
                print(samples, "samples have been seen")
        except tf.errors.OutOfRangeError:
            print('Done training -- epoch limit reached')
        finally:
            coord.request_stop();
    coord.join(threads)

This code runs perfectly well for me, printing to console:

10 samples have been seen
20 samples have been seen
30 samples have been seen
40 samples have been seen
50 samples have been seen
60 samples have been seen
70 samples have been seen
80 samples have been seen
90 samples have been seen
100 samples have been seen
110 samples have been seen
120 samples have been seen
130 samples have been seen
140 samples have been seen
150 samples have been seen
160 samples have been seen
170 samples have been seen
180 samples have been seen
190 samples have been seen
200 samples have been seen
Done training -- epoch limit reached

As can be seen, it uses deprecated functions and objects as tf.train.string_input_producer() and tf.WholeFileReader(). An equivalent implementation using the 'modern' tf.data.Dataset is needed.

EDIT:

Found already given example for importing CSV data: Replacing Queue-based input pipelines with tf.data. I would like to be as complete as possible here, and suppose that more examples are better, so I don't feel it as a repeated question.

Related Questions