I am trying to read an encrypted CSV file from Azure Blob Storage, decrypt it with gnupg, and read it in Python. I can access the blob, but when I pass it to my decrypt function it throws an error.
Error: expected str, bytes or os.PathLike object, not StorageStreamDownloader
The downloaded blob is of type StorageStreamDownloader. When I convert it to bytes, I get an "embedded null byte" error instead.
Can someone help me with this? Below is my code.
from azure.storage.blob import BlobServiceClient, BlobClient
import pandas as pd
import csv
from io import StringIO
from pyspark.sql import SparkSession
import io
connection_string = "AAAA"
container_name = "BBBB"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client(container_name)
import gnupg
gpg = gnupg.GPG()
gpg.encoding = 'utf-8'
passphrase = "12345"
secret = "112233"
def decrypt_file(filename, secret, passphrase):
    gpg.encoding = 'utf-8'
    with open(filename, 'rb') as f:
        decrypted_data = gpg.decrypt(f, passphrase=passphrase)
    if decrypted_data.ok:
        print("done")
    else:
        print("error:", decrypted_data.status)
        print("error:", decrypted_data.stderr)
    return str(decrypted_data)
blob_client = container_client.get_blob_client(file)
blob_file_tinb = blob_client.download_blob()
tinb_file = decrypt_file(blob_file_tinb,secret,passphrase)
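The error occurs because open() expects a file path, while download_blob() returns a StorageStreamDownloader. Converting it to bytes does not help as long as those bytes are still passed to open(), which then treats the file contents as a path. Read the blob into bytes with readall() and pass the bytes directly to gpg.decrypt(). You can use the code below to download the encrypted CSV file from Azure Blob Storage with the Azure Python SDK and decrypt it: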
Code:
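A minimal sketch of this approach, assuming the private key is already imported into the local GnuPG keyring. The blob name `data.csv.gpg` and the helper `parse_csv()` are hypothetical and only there for illustration; the connection string, container name, and passphrase are the placeholders from the question.

```python
import csv
import io


def decrypt_file(encrypted_bytes, passphrase, gpg):
    """Decrypt GPG-encrypted bytes and return the plaintext as a string."""
    # gpg.decrypt() accepts the encrypted data itself, so there is no need
    # to call open() on it (that call is what raised the original error).
    result = gpg.decrypt(encrypted_bytes, passphrase=passphrase)
    if not result.ok:
        raise RuntimeError(f"decryption failed: {result.status}")
    return str(result)


def parse_csv(text):
    """Parse decrypted CSV text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text)))


def main():
    # Third-party imports live here so the helpers above stay importable
    # without the Azure SDK or python-gnupg installed.
    import gnupg
    from azure.storage.blob import BlobServiceClient

    connection_string = "AAAA"   # placeholder from the question
    container_name = "BBBB"      # placeholder from the question
    blob_name = "data.csv.gpg"   # hypothetical blob name
    passphrase = "12345"         # placeholder from the question

    service = BlobServiceClient.from_connection_string(connection_string)
    container_client = service.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)

    # download_blob() returns a StorageStreamDownloader; readall() reads
    # the whole blob into memory as plain bytes.
    encrypted_bytes = blob_client.download_blob().readall()

    gpg = gnupg.GPG()  # assumes the private key is already in the keyring
    gpg.encoding = "utf-8"

    text = decrypt_file(encrypted_bytes, passphrase, gpg)
    print(text)
    print(parse_csv(text)[:5])  # first few parsed rows
```

Call `main()` once the placeholders are filled in. Because `readall()` loads the whole blob into memory, this is best suited to files that comfortably fit in RAM.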
The above code downloads an encrypted file from Azure Blob Storage, decrypts it using GnuPG, and prints the decrypted data. It sets up the connection string and container name, and creates the BlobServiceClient and ContainerClient objects. It also sets up the GnuPG object and the passphrase used to decrypt the file. Finally, it gets the BlobClient object for the encrypted file, downloads the file data, and passes it to the decrypt_file() function to decrypt and print the data.