I have searched a lot here but unfortunately could not find an answer.
I am running TensorFlow 1.3 (installed via pip on macOS) on my local machine, and have created a model using the provided "ssd_mobilenet_v1_coco" checkpoints.
I managed to train locally and on the ML Engine (Runtime 1.2), and successfully deployed my SavedModel to the ML Engine.
Local prediction (command below) works fine and I get the model results:
gcloud ml-engine local predict --model-dir=... --json-instances=request.json
FILE request.json: {"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}
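In case it helps, this is a rough sketch of how the request.json can be generated from an image; the image path and resize dimensions are placeholders, and the "inputs" key is assumed to match the input tensor of my exported SavedModel:

import json
import numpy as np
from PIL import Image

# Placeholder image path and input size -- adjust to your own image and model.
image = Image.open("test_image.jpg").convert("RGB").resize((300, 300))
pixels = np.asarray(image, dtype=np.uint8).tolist()  # nested lists of [R, G, B] per pixel

# Write a single instance with the "inputs" key used in the request above.
with open("request.json", "w") as f:
    json.dump({"inputs": pixels}, f)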
However, when I deploy the model and try to run a remote prediction on the ML Engine with the command below (same JSON file as before):
gcloud ml-engine predict --model "testModel" --json-instances request.json
I get this error:
{
"error": "Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'data_format' not in Op<name=DepthwiseConv2dNative; signature=input:T, filter:T -> output:T; attr=T:type,allowed=[DT_FLOAT, DT_DOUBLE]; attr=strides:list(int); attr=padding:string,allowed=[\"SAME\", \"VALID\"]>; NodeDef: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)\n\t [[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)]]\")"
}
I saw something similar here: https://github.com/tensorflow/models/issues/1581
It describes a problem with the 'data_format' parameter, but unfortunately I could not use that solution since I am already on TensorFlow 1.3.
It also seems that it might be a problem with MobilenetV1: https://github.com/tensorflow/models/issues/2153
Any ideas?
I had a similar issue. It is caused by a mismatch between the TensorFlow versions used for training and inference. I solved it by using TensorFlow 1.4 for both training and inference.
Please refer to this answer.
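To double-check which TensorFlow version a SavedModel was exported with before deploying, something like the sketch below can help (the export directory path is a placeholder):

import tensorflow as tf

print("Local TensorFlow version:", tf.__version__)

# Load the SavedModel's meta graph and read the TensorFlow version recorded at
# export time; "export_dir" is a placeholder for your SavedModel directory.
export_dir = "exported_model/saved_model"
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    print("Exported with TensorFlow:", meta_graph.meta_info_def.tensorflow_version)

The runtime version of the deployed model version (set with --runtime-version when creating the version on the ML Engine) should match the training version as well.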