Download particular source file from AzureML snapshot directly


The Azure Machine Learning SDK for Python provides the restore_snapshot() method to download a zip file with the source code of an AzureML run. Assume that I want to inspect a particular source file programmatically, and that I already know its path. In that case I would prefer not to download the whole zip file, extract it, and locate the file. Instead, I would like a direct way to download just that one file from the snapshotted source code. Is that possible?

Answer by Rishabh Meshram

Currently, there is no direct way to download a single file from the snapshotted source code of an AzureML run. The supported route is to download the whole zip file, extract it, and then locate the file you want. One possible alternative, however, is to take the URI of the file and download the corresponding blob directly from the storage account using its connection string.

Below is sample code that you can refer to and modify according to your requirements:

import urllib.parse
# Parse the URI to get the container and blob name
uri = "Your File URI"
parsed_uri = urllib.parse.urlparse(uri)

# Split the URI path into the container name (first segment) and the rest.
path_parts = parsed_uri.path.lstrip("/").split("/", 1)
container_name = path_parts[0]

# The blob name is everything after the container, including any virtual folders.
blob_name = path_parts[1]

# Download the blob content and save it as a .py file
from azure.storage.blob import BlobServiceClient
import os

# Get the connection string for your storage account.
connection_string = "Your storage account connection string"

# Create a BlobServiceClient object using the connection string.
blob_service_client = BlobServiceClient.from_connection_string(connection_string)

# Get a reference to the container that contains the file you want to download.
container_client = blob_service_client.get_container_client(container_name)

# Get a reference to the blob that represents the file you want to download.
blob_client = container_client.get_blob_client(blob_name)

# Download the file to a local directory, creating the directory if needed.
local_path = "outputs"
os.makedirs(local_path, exist_ok=True)
local_file = os.path.join(local_path, os.path.basename(blob_name))
with open(local_file, "wb") as f:
    data = blob_client.download_blob()
    data.readinto(f)

# Add a .py extension, unless the downloaded file already has one.
if not local_file.endswith(".py"):
    os.rename(local_file, local_file + ".py")

With the code above, I was able to access a particular file from the snapshotted source code.
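If the blob route is not an option (for example, when the file URI or the connection string is not available), the zip route can at least avoid extracting the whole archive. The following is a minimal sketch, assuming the snapshot zip has already been downloaded (for instance with restore_snapshot()) and that the path of the file inside the archive is known. The helper name read_snapshot_file and the paths in the usage note are hypothetical, chosen only for illustration.

```python
import zipfile


def read_snapshot_file(zip_path, member_path):
    """Return the bytes of one file inside a zip without extracting the rest.

    zip_path and member_path are illustrative parameter names, not part of
    the AzureML SDK.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Zip member names use forward slashes regardless of the OS.
        normalized = member_path.replace("\\", "/").lstrip("/")
        if normalized not in archive.namelist():
            raise FileNotFoundError(f"{normalized} not found in {zip_path}")
        # Only the requested member is read and decompressed.
        return archive.read(normalized)
```

A call such as read_snapshot_file("snapshot.zip", "src/train.py").decode("utf-8") would then give the source text for inspection. The whole zip is still downloaded, so this only saves the extract-and-locate step.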