H2O-3 AI Cannot Import Model from Google Cloud Storage to Cluster

I've been trying to run saved H2O models on an H2O cluster on Google Cloud for the past few days.

I was able to deploy and connect to the cluster using this guide: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/cloud-integration/google-compute.html

h2o.cluster().show_status()

H2O_cluster_uptime: 4 hours 38 mins
H2O_cluster_timezone: Etc/UTC
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.32.1.2
H2O_cluster_version_age: 12 days
H2O_cluster_name: root
H2O_cluster_total_nodes: 1
H2O_cluster_free_memory: 6.220 Gb
H2O_cluster_total_cores: 2
H2O_cluster_allowed_cores: 2
H2O_cluster_status: locked, healthy

I uploaded the saved models to Google Cloud Storage and mounted the bucket on the VM with Cloud Storage FUSE at this folder:

/tmp/gcsModels/

Now, whenever I try to load a model using h2o.load_model:

import os
import h2o

models_path = "/tmp/gcsModels/serverless/v1/"
pca_model = h2o.load_model(os.path.join(models_path, "cust_PCA_DEMO_v1"))

I encounter this error:

H2OResponseError: Server error water.exceptions.H2OIllegalArgumentException:
  Error: Illegal argument: dir of function: importModel: water.api.FSIOException: FS IO Failure: 
 accessed path : file:/tmp/gcsModels/serverless/v1/cust_PCA_DEMO_v1 msg: File not found
  Request: POST /99/Models.bin/
    data: {'dir': '/tmp/gcsModels/serverless/v1/cust_PCA_DEMO_v1'}

Upon checking, the model files are all under the /tmp/gcsModels folder:

ls /tmp/gcsModels/serverless/v1/

cust_GBM_DEMO_LIKELIHOOD_v2
cust_GBM_DEMO_LIKELIHOOD_v2_cv5
cust_GBM_DEMO_LOGAMOUNT_v1_cv5
cust_PCA_DEMO_v1

I have no idea what I did wrong. Any ideas would be greatly appreciated.

There is 1 answer

Neema Mashayekhi On

Your Python client may be hosted on a different machine than your H2O server. When you connect with h2o.connect(url="https://[external ip]:54321", auth=(username, password)), you are specifying an external IP address, so the file system you inspect with ls is not necessarily the one the H2O server reads from.

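h2o.load_model resolves the path on the H2O server, not on the Python client. Your error message is a server error, which shows that the file was not found on the file system the H2O server is running on: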

accessed path : file:/tmp/gcsModels/serverless/v1/cust_PCA_DEMO_v1 msg: File not found.

Try using gs:// to specify that the file is located on Google Cloud Storage. I can't tell what your exact path is, but I would expect it to be something like:

h2o.load_model("gs://<BUCKETNAME>/gcsModels/serverless/v1/cust_PCA_DEMO_v1")
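If it helps to work out the gs:// path, here is a minimal sketch that maps a path under the FUSE mount to a bucket URI. It assumes the bucket root is mounted directly at /tmp/gcsModels, which may not match your setup. The helper fuse_path_to_gs_uri and the bucket name "my-bucket" are hypothetical, made up for illustration.

```python
import posixpath


def fuse_path_to_gs_uri(local_path, mount_point, bucket):
    """Map a path under a Cloud Storage FUSE mount to a gs:// URI.

    Assumption: the bucket root is mounted at mount_point, so the part of
    local_path below mount_point is the object prefix inside the bucket.
    """
    local = posixpath.normpath(local_path)
    mount = posixpath.normpath(mount_point)
    # Refuse paths that are not inside the mount point.
    if local != mount and not local.startswith(mount + "/"):
        raise ValueError(f"{local_path!r} is not under {mount_point!r}")
    relative = posixpath.relpath(local, mount)
    if relative == ".":
        return f"gs://{bucket}"
    return f"gs://{bucket}/{relative}"


# "my-bucket" is a placeholder; substitute your real bucket name.
uri = fuse_path_to_gs_uri(
    "/tmp/gcsModels/serverless/v1/cust_PCA_DEMO_v1",
    "/tmp/gcsModels",
    "my-bucket",
)
print(uri)
```

You would then pass the result to h2o.load_model(uri). Whether that works depends on the H2O server having Google Cloud Storage access and credentials for the bucket.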