Running goofys in the foreground is the only way to prevent transport endpoint disconnections from occurring


I am mounting a Google Cloud Storage bucket with goofys (fuse) to my docker container and running deep learning training.

The training set has ~10k data points and the test set ~600. Between model fitting and testing, I rsync the model into the bucket, since the bucket doesn't support writing HDF5 files directly (but that's a separate issue). When goofys runs in the background, the application crashes as soon as it tries to access the test data. It handles all the training data just fine and copies the model over to the bucket, but when testing begins I get the following output:

Transport endpoint is not connected

I am mounting with the following command (for background mounting, foreground has the -f flag):

goofys -o allow_other --profile <profilename> --stat-cache-ttl 10s --type-cache-ttl 10s --endpoint <endpnt> --dir-mode 0777 --file-mode 0777 <source_bucket> /mnt/<src_dir>

If I mount the bucket in the foreground, the application runs through in its entirety.

I don't understand why it would work in the foreground but not in the background. What is the difference between goofys running as a daemon and goofys running in the foreground when data throughput is very high?

goofys version: goofys version 0.24.0-45b8d78375af1b24604439d2e60c567654bcdf88


There is 1 answer

Snow24 On

The solution was to replace goofys with GCSFuse. Despite the performance advantages of goofys, GCSFuse proved more reliable at keeping buckets mounted.
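
Whichever FUSE client is used, it can help to check that the mount is still alive between the training and testing phases, so a dropped mount is caught before the test data is read. The following is a minimal sketch in Python, not part of either tool: the function name `mount_is_alive` is hypothetical, and it assumes the "Transport endpoint is not connected" message surfaces as an `OSError` with `errno.ENOTCONN`, which is how Linux normally reports a FUSE mount whose daemon has gone away.

```python
import errno
import os


def mount_is_alive(mount_point: str) -> bool:
    """Return True if mount_point is a mounted, reachable filesystem.

    Hypothetical helper: assumes a dead FUSE daemon shows up as ENOTCONN
    ("Transport endpoint is not connected") when the mount point is stat'ed.
    """
    try:
        os.stat(mount_point)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            # The kernel still has the mount, but the FUSE process is gone.
            return False
        # Anything else (e.g. the path does not exist) is a different problem.
        raise
    # The path is reachable; also confirm something is actually mounted there,
    # since a cleanly unmounted bucket leaves behind an ordinary empty directory.
    return os.path.ismount(mount_point)
```

Calling `mount_is_alive("/mnt/<src_dir>")` right after the rsync step, and remounting or aborting if it returns False, would at least turn the crash into an explicit, early failure.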