tf.train.MonitoredTrainingSession arguments


What does the config argument (default None) of tf.train.MonitoredTrainingSession accept? How can I specify the master node (for example, at localhost:2222) with the proper syntax?

Below is the error I encounter when I use config='grpc://localhost:2222':

Traceback (most recent call last):
  File "add_1.py", line 36, in <module>
    scaffold=None, hooks=[saver_hook, summary_hook], chief_only_hooks=None, save_checkpoint_secs=10, save_summaries_steps=None, config='grpc://localhost:2222') as sess:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 289, in MonitoredTrainingSession
    return MonitoredSession(session_creator=session_creator, hooks=hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 447, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 618, in __init__
    _WrappedSession.__init__(self, self._sess_creator.create_session())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 505, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 341, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 227, in prepare_session
    config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 153, in _restore_checkpoint
    sess = session.Session(self._target, graph=self._graph, config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1186, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 540, in __init__
    % type(config))
TypeError: config must be a tf.ConfigProto, but got <type 'str'>
Exception AttributeError: "'Session' object has no attribute '_session'" in <bound method Session.__del__ of <tensorflow.python.client.session.Session object at 0x7fc540937ed0>> ignored

Answer by mrry (best answer)

The config argument to tf.train.MonitoredTrainingSession takes a tf.ConfigProto protocol buffer message.

It looks like you should actually pass your argument ("grpc://localhost:2222") as the master argument, which takes the same values as the target argument to the tf.Session initializer. For example, "" means "in-process runtime", and "grpc://localhost:2222" means "the gRPC-based tf.train.Server listening on localhost:2222".
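
To make the difference between the two arguments concrete, here is a minimal sketch, assuming TensorFlow 1.x (the version in the traceback). The helper names grpc_target and open_training_session are hypothetical and only for illustration, and allow_soft_placement is just one example ConfigProto option, not something the question requires.

```python
def grpc_target(host, port):
    """Build the string that `master` expects, e.g. 'grpc://localhost:2222'."""
    return "grpc://{}:{}".format(host, port)


def open_training_session(master="", checkpoint_dir=None):
    """Hypothetical helper: the address goes in `master`, options go in `config`."""
    # Imported lazily so grpc_target() works without TensorFlow installed.
    import tensorflow as tf  # assumption: TensorFlow 1.x API

    # `config` must be a tf.ConfigProto message, never a string.
    session_config = tf.ConfigProto(allow_soft_placement=True)

    return tf.train.MonitoredTrainingSession(
        master=master,            # "" means the in-process runtime
        is_chief=True,
        checkpoint_dir=checkpoint_dir,
        config=session_config,
    )


if __name__ == "__main__":
    # Assumes a tf.train.Server is already listening on localhost:2222.
    with open_training_session(master=grpc_target("localhost", 2222)) as sess:
        pass  # run training ops with sess.run(...) here
```

With this split, the TypeError from the question should not occur, because the string never reaches config. If you do not need any session options, leaving config=None is also fine.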