AutoGluon MultilabelPredictor issue: Airflow cannot access a trained model


I have trained a 'Multilabel Predictor' model with AutoGluon on my local computer. I need to run an Airflow pipeline that makes predictions and stores them in a table in Redshift. The problem with the model stored on my computer is that the pickle file contains the hard-coded path from my computer (screenshot 1: first line of the pickle file), so when Airflow tries to predict, there is an error that the path cannot be found. Because of this, I trained the same model in SageMaker and stored it in S3. When I try to predict with that model (the one stored in S3), there is a different error: botocore cannot locate the credentials (screenshot 2: Airflow error log).
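To make the hard-coded path problem concrete, here is a minimal sketch of how the stored paths could be rewritten after unpickling. It assumes the predictor object keeps its root directory in an attribute `path` and a dict `predictors` mapping each label to the string path of a sub-predictor folder directly under that root, which is how the tutorial's class appears to work. The attribute layout is an assumption, and the helper names `last_component` and `relocate_predictor_paths` are hypothetical.

```python
import os
import re


def last_component(path: str) -> str:
    """Return the final folder name of a path written with forward or back slashes."""
    return re.split(r"[\\/]+", path.rstrip("\\/"))[-1]


def relocate_predictor_paths(multi_predictor, new_root: str):
    """Point an already-unpickled multilabel predictor at a new model directory.

    Assumption: `multi_predictor.path` is the model root and
    `multi_predictor.predictors` maps label -> str path of a folder
    stored directly under that root.
    """
    new_root = os.path.join(new_root, "")  # make sure the root ends with a separator
    multi_predictor.path = new_root
    for label, sub in multi_predictor.predictors.items():
        if isinstance(sub, str):  # only rewrite entries that are stored as paths
            multi_predictor.predictors[label] = os.path.join(new_root, last_component(sub))
    return multi_predictor
```

The idea would be to copy the whole model folder to the Airflow worker, unpickle the top-level file, and call `relocate_predictor_paths(model, "/path/on/worker")` before predicting. Whether this is enough depends on what else the pickle stores.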

Could you please give me any pointers on how to build an Airflow pipeline with AutoGluon's multilabel predictor? I have already done this for TabularPredictor and it worked perfectly.

[Screenshot 1: first line of the pickle file]
[Screenshot 2: Airflow error log]

When I trained the model in SageMaker with the AutoGluon multilabel predictor, I expected that I could then access the trained model and that Airflow would be able to make the predictions. What actually happened is that Airflow returns an error saying it cannot locate the credentials.

The 'usual' way of loading a previously saved model is:

model = MultilabelPredictor.load(path)  # path is a str

where the path passed to load can be an S3 path. I tried passing the path as

model = MultilabelPredictor.load('s3://my-bucket/model')

but it returns the error that it cannot locate the credentials, so what I did instead is:

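import io
import boto3

session = boto3.Session()  # assumption: the original does not show how session was created
s3_prefix = "some prefix"  # key of the pickle file in the bucket

s3res = session.resource('s3')

# read the pickled predictor from S3 into memory and unpickle it
buffer = s3res.Bucket('my-bucket').Object(s3_prefix).get()["Body"].read()
data_s3 = CustomUnpickler(io.BytesIO(buffer)).load()
y_pred = data_s3.predict(df_train)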

where CustomUnpickler is

import pickle

class CustomUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # resolve MultilabelPredictor from my_module instead of the module recorded in the pickle
        if name == 'MultilabelPredictor':
            from my_module import MultilabelPredictor
            return MultilabelPredictor
        return super().find_class(module, name)

and I get the same credentials error.
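For context, one alternative to loading from S3 directly would be to copy the whole model folder to the worker's local disk first and then load it from there. The sketch below assumes boto3 can resolve credentials in the Airflow environment by some means (environment variables, an instance role, or a session built explicitly). It fails early with a clear message if it cannot. The function names `local_path_for_key` and `download_model_dir` are hypothetical; the boto3 calls used are `Session.get_credentials`, the `list_objects_v2` paginator, and `download_file`.

```python
import os


def local_path_for_key(key: str, prefix: str, dest_dir: str) -> str:
    """Map an S3 object key under `prefix` to a file path under `dest_dir`."""
    relative = key[len(prefix):].lstrip("/")
    return os.path.join(dest_dir, *relative.split("/"))


def download_model_dir(bucket: str, prefix: str, dest_dir: str, session=None) -> str:
    """Copy every object under s3://bucket/prefix into dest_dir and return dest_dir."""
    import boto3  # imported lazily so the path helper above works without boto3

    session = session or boto3.Session()
    if session.get_credentials() is None:
        # same root cause as botocore's "Unable to locate credentials"
        raise RuntimeError("boto3 found no AWS credentials in this environment")

    client = session.client("s3")
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/") or key == prefix:  # skip folder placeholder objects
                continue
            target = local_path_for_key(key, prefix, dest_dir)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            client.download_file(bucket, key, target)
    return dest_dir
```

With something like `download_model_dir('my-bucket', 'model', '/tmp/model')` done first, `MultilabelPredictor.load('/tmp/model')` would read only local files, although any paths stored inside the pickle may still point at the original location.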

Here is the link to the AutoGluon multilabel predictor tutorial: https://auto.gluon.ai/0.3.1/tutorials/tabular_prediction/tabular-multilabel.html
