gcloud ml-engine local predict --text-instances fails with "Could not parse" error


I'm trying to make the TensorFlow Boston sample (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/input_fn) work on Google Cloud ML Engine. Training seems to succeed, but I'm struggling with the subsequent predictions.

  1. I've tweaked the code to fit with tf.contrib.learn.Experiment and learn_runner.run(). It runs both locally and in the cloud with "gcloud ml-engine local train ..."/"gcloud ml-engine jobs submit training ...".

  2. With the trained model I can run estimator.predict(input_fn=predict_input_fn) and get meaningful predictions for the given boston_predict.csv set.

  3. I can create and version the model in the cloud with "gcloud ml-engine models create ..." and "gcloud ml-engine versions create ..."

But

  1. Local predictions with "gcloud ml-engine local predict --model-dir=/export/Servo/XXX --text-instances boston_predict.csv" fail with "InvalidArgumentError (see above for traceback): Could not parse example input <..>" (Error code: 2). See below for the transcript. It fails similarly with a headerless boston_predict.csv.

I've looked up the expected format with "$ gcloud ml-engine local predict --help" and read https://cloud.google.com/ml-engine/docs/how-tos/troubleshooting, but I have failed to find reports of my specific error via Google or Stack Exchange.

I'm a noob, so I'm probably erring in some basic way, but I cannot spot it.

All and any help is appreciated,

:-)

yarc68000.

-------environment----------

(env1) $ gcloud --version
Google Cloud SDK 170.0.0
alpha 2017.03.24
beta 2017.03.24
bq 2.0.25
core 2017.09.01
datalab 20170818
gcloud 
gsutil 4.27

(env1) $ python --version
Python 2.7.13 :: Anaconda 4.3.1 (64-bit)

(env1) $ conda list | grep tensorflow
tensorflow                1.3.0                     <pip>
tensorflow-tensorboard    0.1.6                     <pip>

------------execution and error : boston_predict.csv ----------

$ gcloud ml-engine local predict --model-dir=<..>/export/Servo/1504780684 --text-instances 1709boston/boston_predict.csv
<..>
ERROR:root:Exception during running the graph: Could not parse example input, value: 'CRIM,ZN,INDUS,NOX,RM,AGE,DIS,TAX,PTRATIO'
[[Node: ParseExample/ParseExample = ParseExample[Ndense=9, Nsparse=0, Tdense=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], dense_shapes=[[1], [1], [1], [1], [1], [1], [1], [1], [1]], sparse_types=[], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_Placeholder_0_0, ParseExample/ParseExample/names, ParseExample/ParseExample/dense_keys_0, ParseExample/ParseExample/dense_keys_1, ParseExample/ParseExample/dense_keys_2, ParseExample/ParseExample/dense_keys_3, ParseExample/ParseExample/dense_keys_4, ParseExample/ParseExample/dense_keys_5, ParseExample/ParseExample/dense_keys_6, ParseExample/ParseExample/dense_keys_7, ParseExample/ParseExample/dense_keys_8, ParseExample/Const, ParseExample/Const_1, ParseExample/Const_2, ParseExample/Const_3, ParseExample/Const_4, ParseExample/Const_5, ParseExample/Const_6, ParseExample/Const_7, ParseExample/Const_8)]]
<..>

------- execution and error headerless boston_predict.csv ------

(here I try with a boston_predict.csv with the first line omitted)

$ gcloud ml-engine local predict --model-dir=<..>/export/Servo/1504780684 --text-instances 1709boston/boston_predict_headerless.csv
<..>
ERROR:root:Exception during running the graph: Could not parse example input, value: '0.03359,75.0,2.95,0.428,7.024,15.8,5.4011,252,18.3'
[[Node: ParseExample/ParseExample = ParseExample[Ndense=9, Nsparse=0, Tdense=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], dense_shapes=[[1], [1], [1], [1], [1], [1], [1], [1], [1]], sparse_types=[], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_Placeholder_0_0, ParseExample/ParseExample/names, ParseExample/ParseExample/dense_keys_0, ParseExample/ParseExample/dense_keys_1, ParseExample/ParseExample/dense_keys_2, ParseExample/ParseExample/dense_keys_3, ParseExample/ParseExample/dense_keys_4, ParseExample/ParseExample/dense_keys_5, ParseExample/ParseExample/dense_keys_6, ParseExample/ParseExample/dense_keys_7, ParseExample/ParseExample/dense_keys_8, ParseExample/Const, ParseExample/Const_1, ParseExample/Const_2, ParseExample/Const_3, ParseExample/Const_4, ParseExample/Const_5, ParseExample/Const_6, ParseExample/Const_7, ParseExample/Const_8)]]
<..>
There is 1 answer

rhaertel80 (best answer)

There are likely two problems.

First, it looks as though the graph that you are exporting is expecting tf.Example protos as input, i.e. has a parse_example(...) op in it. The Boston sample does not appear to be adding that op, so I suspect that is part of your modifications.

Before showing the code you want for the input_fn, we need to talk about the second problem: versioning. Estimators existed in previous versions of TensorFlow under tensorflow.contrib. However, various parts have migrated into tensorflow.estimator with successive TensorFlow versions and the APIs have changed as they've moved.

CloudML Engine currently (as of 07 Sep 2017) only supports TF 1.0 and 1.2, so I'll provide a solution that works with 1.2. This is based on the census sample. This is the input_fn you need in order to use CSV data, although I generally recommend exporting models that are independent of input format:

import tensorflow as tf

# FEATURES is the list of feature names from the Boston sample, in the same
# order as the CSV columns.
# Provides the data types for the various columns.
FEATURE_DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0], [0.0]]

def predict_input_fn():
  # The export strategy calls this function without arguments, so the
  # placeholder that receives the raw CSV lines is created here.
  csv_row = tf.placeholder(shape=[None], dtype=tf.string)
  # Takes a rank-1 tensor and converts it into rank-2 tensor
  # Example if the data is ['csv,line,1', 'csv,line,2', ..] to
  # [['csv,line,1'], ['csv,line,2']] which after parsing will result in a
  # tuple of tensors: [['csv'], ['csv']], [['line'], ['line']], [[1], [2]]
  row_columns = tf.expand_dims(csv_row, -1)
  columns = tf.decode_csv(row_columns, record_defaults=FEATURE_DEFAULTS)
  features = dict(zip(FEATURES, columns))

  return tf.contrib.learn.InputFnOps(features, None, {'csv_row': csv_row})
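
To make the reshaping in the comments above concrete, here is a rough plain-Python analogue of what the decode step does to a batch of CSV lines. It is only a sketch and does not use TensorFlow: decode_rows is a hypothetical helper, and the feature names in the usage line are assumptions. Each default's type decides how a field is converted, loosely mirroring record_defaults.

```python
import csv

def decode_rows(rows, names, defaults):
    """Split CSV lines and group the values per column, one [value] per row.

    `defaults` uses the same nested shape as FEATURE_DEFAULTS, e.g. [[0.0], [0]].
    """
    columns = {name: [] for name in names}
    for fields in csv.reader(rows):
        if len(fields) != len(names):
            raise ValueError("expected %d fields, got %d" % (len(names), len(fields)))
        for name, default, field in zip(names, defaults, fields):
            caster = type(default[0])  # float or int, taken from the default
            value = caster(field) if field != "" else default[0]
            columns[name].append([value])
    return columns

# Hypothetical three-column example (names are assumptions):
features = decode_rows(["0.03359,75.0,252"], ["crim", "zn", "tax"], [[0.0], [0.0], [0]])
# features == {'crim': [[0.03359]], 'zn': [[75.0]], 'tax': [[252]]}
```

In this analogue a header line such as 'CRIM,ZN,INDUS' raises a ValueError because the fields are not numbers. I would expect a CSV-parsing serving graph to reject the header for the same reason, so the headerless file is the one to send.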

And you'll need an export strategy like this:

from tensorflow.contrib.learn.python.learn.utils import saved_model_export_utils

export_strategy = saved_model_export_utils.make_export_strategy(
    predict_input_fn,
    exports_to_keep=1,
    default_output_alternative_key=None,
)

which you'll pass as a list of size 1, via the export_strategies argument, to the constructor of tf.contrib.learn.Experiment.
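
On the recommendation above to export models that are independent of input format: if the model were instead exported with a serving input that takes one tensor per feature name (an assumption, not what the code above does), the local predict command could be given a newline-delimited JSON file via --json-instances instead of --text-instances. A minimal sketch of the conversion, where csv_to_json_instances and the lowercase feature names are hypothetical:

```python
import csv
import json

def csv_to_json_instances(csv_text, feature_names, skip_header=True):
    """Turn CSV text into newline-delimited JSON, one object per data row."""
    lines = [line for line in csv_text.splitlines() if line.strip()]
    if skip_header:
        lines = lines[1:]  # drop the 'CRIM,ZN,...' header line
    instances = []
    for fields in csv.reader(lines):
        # Assumes every column is numeric, as in boston_predict.csv.
        instance = {name: float(field) for name, field in zip(feature_names, fields)}
        instances.append(json.dumps(instance, sort_keys=True))
    return "\n".join(instances)

# Hypothetical two-column example; the real file has nine columns.
print(csv_to_json_instances("CRIM,ZN\n0.5,75.0\n", ["crim", "zn"]))
```

The keys have to match whatever input names the exported signature declares, so treat the names here as placeholders.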