Reading file from Azure blob storage using pickle or dill without saving to disk


I'm trying to read weights for a machine learning model from Azure Blob Storage in Python. This runs in Azure Functions, so I don't believe I can use methods that save the blob to disk.

I'm using azure-storage-blob 12.5.0, not the legacy version.

I've tried using dill.loads to load the .pkl file, like so:

from io import BytesIO
import dill
from azure.storage.blob import BlobClient

connection_string = 'my_connection_string'
blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
downloader = blob_client.download_blob(0)

with BytesIO() as f:
    downloader.readinto(f)
    weights = dill.loads(f)  # raises the TypeError below

Which returns:

TypeError: a bytes-like object is required, not '_io.BytesIO'

I'm also not sure what the equivalent approach using pickle would look like. How can this be solved?
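
For reference, dill mirrors pickle's API: dill.loads takes raw bytes, while dill.load (no trailing "s") takes a file-like object, which must be rewound after writing into it. A minimal sketch of the distinction, using only an in-memory buffer (the weights dict is a stand-in object for illustration):

import dill
from io import BytesIO

weights = {'layer1': [0.1, 0.2]}  # stand-in object for illustration
buf = BytesIO(dill.dumps(weights))

# Option 1: pass raw bytes to dill.loads
restored = dill.loads(buf.getvalue())

# Option 2: rewind the buffer and pass the file-like object to dill.load
buf.seek(0)
restored = dill.load(buf)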

There are 2 answers

Samuel (Best Answer)

Here is how this problem was solved:

import pickle
from azure.storage.blob import BlobClient

def get_weights_blob(blob_name):
    connection_string = 'my_connection_string'
    container_name = 'my_container'
    blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
    downloader = blob_client.download_blob(0)

    # Read the whole blob into memory as bytes, then unpickle
    b = downloader.readall()
    weights = pickle.loads(b)

    return weights

The weights can then be retrieved by calling the function:

weights = get_weights_blob(blob_name = 'myPickleFile')
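
Since the question mentions dill, the same pattern works there too; a sketch, assuming the blob was written with dill (get_weights_blob_dill is a hypothetical helper mirroring the function above):

import dill
from azure.storage.blob import BlobClient

def get_weights_blob_dill(blob_name):
    # Hypothetical helper: identical to get_weights_blob, but using dill
    connection_string = 'my_connection_string'
    container_name = 'my_container'
    blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
    b = blob_client.download_blob().readall()  # bytes, not a file object
    return dill.loads(b)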
Manoj Alwis

Here is my working sample:

import pickle
import azure.functions as func
from azure.storage.blob import BlobClient

def main(req: func.HttpRequest) -> func.HttpResponse:
    connection_string = ''
    blob_client = BlobClient.from_connection_string(connection_string, 'blog-storage-containe', 'blobfile')
    downloader = blob_client.download_blob(0)

    b = downloader.readall()
    loaded_model = pickle.loads(b)

    # Return a response so the signature is satisfied
    return func.HttpResponse('Model loaded')
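
Rather than hard-coding the connection string, Azure Functions exposes app settings as environment variables, so a common pattern is to read it at runtime (the setting name BLOB_CONNECTION_STRING here is an assumption):

import os

# 'BLOB_CONNECTION_STRING' is a hypothetical app setting name;
# Azure Functions surfaces app settings as environment variables.
connection_string = os.environ['BLOB_CONNECTION_STRING']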

And the requirements.txt file:

azure-functions
numpy
joblib
azure-storage-blob
scikit-learn