Translation Model Predictions: TypeError: Object of type 'EagerTensor' is not JSON serializable


I have followed the translation Colab notebook tutorial suggested in Google's tensor2tensor repository.

After exporting the model and uploading it to Google's AI Platform engine for online prediction, I am having trouble making requests to the model.

I believe the input to the translation model should be a tensor encoding the source text, but when I build the request I get: TypeError: Object of type 'EagerTensor' is not JSON serializable


def encode(input_str, output_str=None):
  """Input str to features dict, ready for inference"""
  inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS id
  batch_inputs = tf.reshape(inputs, [1, -1, 1])  # Make it 3D.
  return {"inputs": batch_inputs}

enfr_problem = problems.problem(PROBLEM)
encoders = enfr_problem.feature_encoders(DATA_DIR)

encoded_inputs = encode("Some text")
model_output = predict_json('project_name', 'model_name', encoded_inputs, 'version_1')["outputs"]
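
(For reference, predict_json follows the standard AI Platform online-prediction helper from the Google Cloud docs; the sketch below is roughly what I am using. The exact version in my notebook may differ, in particular how the instances list is built and whether the whole predictions list or just the first prediction is returned.)

import googleapiclient.discovery

def predict_json(project, model, instances, version=None):
    """Send a JSON request to a model deployed on AI Platform and return its prediction."""
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)
    if version is not None:
        name += '/versions/{}'.format(version)
    # the single features dict is wrapped into the instances list; the request body is
    # serialized to JSON here, which is where the TypeError is raised
    response = service.projects().predict(
        name=name,
        body={'instances': [instances]}
    ).execute()
    if 'error' in response:
        raise RuntimeError(response['error'])
    # assumption: return the first prediction so the ["outputs"] indexing above works
    return response['predictions'][0]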

I've tried converting the tensor to numpy but still no luck. Could someone point me in the right direction?


There are 2 answers

ccssmnn

The problem is that TensorFlow returns an EagerTensor when you do:

inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS id
batch_inputs = tf.reshape(inputs, [1, -1, 1])

An EagerTensor cannot be serialized to JSON, and unfortunately a 3D numpy array cannot be either. However, numpy arrays can easily be converted to plain Python lists, which JSON can handle. An example:

import json
import numpy as np
import tensorflow as tf

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = tf.multiply(a, b)

print(c)                   # -> tf.Tensor([1 4 9], shape=(3,), dtype=int64)
print(c.numpy())           # -> [1 4 9]
print(c.numpy().tolist())  # -> [1, 4, 9]

with open("example.json", "w") as f:
    # each of the first two calls raises if you try it:
    # json.dump(c, f)          # TypeError: Object of type EagerTensor is not JSON serializable
    # json.dump(c.numpy(), f)  # TypeError: Object of type ndarray is not JSON serializable
    json.dump(c.numpy().tolist(), f)  # works!

I cannot provide an example for your exact case because your code snippet is not complete, but

return {"inputs": batch_inputs.numpy().tolist()}

should do the job.
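
Putting it together, a minimal sketch of your corrected encode function (assuming the encoders dict from your notebook is in scope) would be:

import tensorflow as tf

def encode(input_str, output_str=None):
  """Input str to a JSON-serializable features dict, ready for online prediction."""
  inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS id
  batch_inputs = tf.reshape(inputs, [1, -1, 1])  # Make it 3D.
  # convert the EagerTensor to a nested Python list so the request body is JSON serializable
  return {"inputs": batch_inputs.numpy().tolist()}

The resulting encoded_inputs can then be passed to predict_json unchanged.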

Charlie Parker

If you want to save tensor data held in a dict to a JSON file, one simple solution is to walk the dictionary recursively and convert each leaf into something JSON serializable (e.g. a string, if the goal is just a readable dump). I am sure TensorFlow has a way to save your data as pickle files if that is what you really want to do (i.e. persist the data itself).

The following code recursively converts everything inside your dict into strings, but you should be able to adapt it easily (to numpy arrays, JSON-native types, etc.) depending on your use case. My use case was saving the data in a human-readable format (and not just torch.save):

#%%

def _to_json_dict_with_strings(dictionary):
    """
    Convert a dict into a dict whose leaves are all strings: values are
    recursively converted to strings unless they are dictionaries themselves.

    Use cases:
        - saving a dictionary of tensors (the tensors are converted to strings!)
        - saving arguments from a script (e.g. argparse) so they print nicely
    """
    if type(dictionary) != dict:
        return str(dictionary)
    d = {k: _to_json_dict_with_strings(v) for k, v in dictionary.items()}
    return d

def to_json(dic):
    # handles plain dicts as well as objects like argparse.Namespace (via __dict__)
    if type(dic) is dict:
        dic = dict(dic)
    else:
        dic = dic.__dict__
    return _to_json_dict_with_strings(dic)

def save_to_json_pretty(dic, path, mode='w', indent=4, sort_keys=True):
    import json

    with open(path, mode) as f:
        json.dump(to_json(dic), f, indent=indent, sort_keys=sort_keys)

def my_pprint(dic):
    """
    Pretty-print a (possibly nested) dict of tensors.

    Note: this is not the same as pprint.
    """
    import json

    # make all leaf values strings recursively with their native str function
    dic = to_json(dic)
    # pretty print
    pretty_dic = json.dumps(dic, indent=4, sort_keys=True)
    print(pretty_dic)

import torch
# import json  # results in non-serializable errors for torch.Tensors
from pprint import pprint

dic = {'x': torch.randn(1, 3), 'rec': {'y': torch.randn(1, 3)}}

my_pprint(dic)
pprint(dic)

output:

{
    "rec": {
        "y": "tensor([[-0.3137,  0.3138,  1.2894]])"
    },
    "x": "tensor([[-1.5909,  0.0516, -1.5445]])"
}
{'rec': {'y': tensor([[-0.3137,  0.3138,  1.2894]])},
 'x': tensor([[-1.5909,  0.0516, -1.5445]])}
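
If you want the JSON to stay machine-readable (so the numbers can be loaded back, rather than strings), a small variation of the same idea converts tensor leaves into nested lists instead of strings. A sketch along the same lines as the code above, not a complete solution:

import torch

def _to_json_dict_with_lists(dictionary):
    """Recursively convert tensor leaves into nested Python lists (JSON serializable)."""
    if isinstance(dictionary, dict):
        return {k: _to_json_dict_with_lists(v) for k, v in dictionary.items()}
    if isinstance(dictionary, torch.Tensor):
        # tolist() turns a tensor into plain Python floats/ints in nested lists
        return dictionary.tolist()
    return dictionary

dic = {'x': torch.randn(1, 3), 'rec': {'y': torch.randn(1, 3)}}
print(_to_json_dict_with_lists(dic))
# e.g. {'x': [[0.12, -0.85, 1.03]], 'rec': {'y': [[...]]}}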

Related links:

https://discuss.pytorch.org/t/typeerror-tensor-is-not-json-serializable/36065/3
How to prettyprint a JSON file?
https://github.com/fossasia/visdom/issues/554