How to process a file located in Azure Blob Storage using Python with the pandas read_fwf function

I need to open and work with data that arrives in a text file, using Python. The file will be stored in Azure Blob Storage or an Azure file share.

My question is: can I use the same modules and functions, such as os.chdir() and read_fwf(), that I was using on Windows? The code I want to run:

    import os
    import pandas as pd

    os.chdir(file_path)
    df = pd.read_fwf(filename)

I want to be able to run this code with file_path pointing to a directory in Azure Blob Storage.

Please let me know if this is possible. If you have a better idea of where the file could be stored, please share.

Thanks,

Answer by Frank Borzage (best answer)

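As far as I know, os.chdir(path) only works on the local file system, so it cannot point to a container in Blob Storage. If you want to download a blob to a local file first, you can use code like the following (BlobServiceClient comes from the azure-storage-blob package):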

    from azure.storage.blob import BlobServiceClient

    connect_str = "<your-connection-string>"
    container_name = "<container-name>"
    file_name = "<blob-name>"
    download_file_path = "<local-path>"

    # Connect to the storage account and get a client for the blob.
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    container_client = blob_service_client.get_container_client(container_name)
    blob_client = container_client.get_blob_client(file_name)

    # Download the blob's contents and write them to a local file.
    with open(download_file_path, "wb") as download_file:
        download_file.write(blob_client.download_blob().readall())
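If the local copy is not needed, the downloaded bytes can probably be handed to pandas without touching the disk. The following is only a sketch under some assumptions: the helper names `read_fwf_from_bytes` and `read_fwf_from_blob` are made up for illustration, the file is assumed to be UTF-8 text unless another encoding is passed, and the Azure import is deferred so the parsing helper can be tried without the SDK installed.

```python
import io

import pandas as pd


def read_fwf_from_bytes(data: bytes, encoding: str = "utf-8", **read_fwf_kwargs) -> pd.DataFrame:
    """Parse fixed-width text held in memory (hypothetical helper)."""
    # Decode the raw bytes and wrap them in a file-like object for pandas.
    return pd.read_fwf(io.StringIO(data.decode(encoding)), **read_fwf_kwargs)


def read_fwf_from_blob(connect_str: str, container_name: str, blob_name: str,
                       **read_fwf_kwargs) -> pd.DataFrame:
    """Download a blob into memory and parse it (hypothetical helper)."""
    # Imported here so read_fwf_from_bytes works without the Azure SDK.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connect_str)
    container_client = service.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)
    # readall() returns the whole blob as bytes, so this suits files that fit in memory.
    data = blob_client.download_blob().readall()
    return read_fwf_from_bytes(data, **read_fwf_kwargs)
```

A call such as `read_fwf_from_blob(connect_str, container_name, file_name, colspecs=[(0, 7), (7, 10)])` would then return the DataFrame directly; the column positions here are placeholders that depend on the actual file layout.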

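pandas.read_fwf can also read the blob directly from storage if you pass it the blob's URL. The URL has to be readable without further authentication, for instance because it carries a SAS token.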

For example:

    import pandas as pd

    url = "https://<your-account>.blob.core.windows.net/test/test.txt?<sas-token>"
    df = pd.read_fwf(url)
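To make the URL approach a little more reusable, the address can be assembled from its parts before it is handed to pandas. This is a rough sketch, not an official recipe: `blob_sas_url` and `read_fwf_from_url` are hypothetical helper names, the host pattern is copied from the example above, and the SAS token is assumed to have been generated elsewhere (for example in the Azure portal).

```python
from urllib.parse import quote

import pandas as pd


def blob_sas_url(account: str, container: str, blob_name: str, sas_token: str) -> str:
    """Assemble an HTTPS URL for a blob, authorised by a SAS token (hypothetical helper)."""
    # Percent-encode the blob name but keep "/" so virtual folders survive.
    path = quote(blob_name, safe="/")
    # Accept tokens with or without a leading "?".
    token = sas_token.lstrip("?")
    return f"https://{account}.blob.core.windows.net/{container}/{path}?{token}"


def read_fwf_from_url(url: str, **read_fwf_kwargs) -> pd.DataFrame:
    """Let pandas fetch the URL itself and parse it as fixed-width text."""
    return pd.read_fwf(url, **read_fwf_kwargs)
```

Usage would look like `df = read_fwf_from_url(blob_sas_url("<your-account>", "test", "test.txt", "<sas-token>"))`. Any extra keyword arguments, such as `colspecs` or `names`, are passed straight through to `pd.read_fwf`.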