Script to take Row Attachments from Smartsheet with API and save to a folder in the DataBricks DBFS is not working (with no errors)

47 views Asked by At

I am creating a Python script that takes the row by row attachments on a given Smartsheet and saves them to a folder in the DataBricks DBFS. The logic is that the script would save the attachments into subfolders named the email address of the user that uploaded the attachment.

The code runs successfully with no errors but there is no output in DBFS. Any guidance would be appreciated thanks!

# Import required libraries
import os
import smartsheet

# Initialize Smartsheet client
SMARTSHEET_ACCESS_TOKEN = dbutils.secrets.get(scope="***********", key="smartsheetapi")
smartsheet_client = smartsheet.Smartsheet(SMARTSHEET_ACCESS_TOKEN)

# Get Smartsheet attachments and store in DBFS
SHEET_ID = ******************
sheet = smartsheet_client.Sheets.get_sheet(SHEET_ID)

# Define the master folder in FileStore of DBFS
master_folder = os.path.join('/dbfs/FileStore', 'smartsheet_attachments')

for row in sheet.rows:
    if row.attachments:
        for attachment in row.attachments:
            response = smartsheet_client.Attachments.get_attachment(SHEET_ID, attachment.id)

            # Create a subfolder for the user who uploaded the attachment
            user_folder = os.path.join(master_folder, attachment.created_by.email)
            if not os.path.exists(user_folder):
                os.makedirs(user_folder)

            # Save the attachment to the user's folder in DBFS
            with open(os.path.join(user_folder, attachment.name), 'wb') as file:
                file.write(response.stream.read())

            # Log the user who uploaded the attachment
            print(f'User {attachment.created_by.email} uploaded the attachment {attachment.name}')

NOTE: *** used to hide private information

1

There are 1 answers

0
Kim Brandl On

As the Smartsheet API docs describe, the Get Attachment operation returns a JSON response. Within the JSON response, the url property contains a (short-lived) URL where the file can be accessed.

Because the Get Attachment response doesn't contain the contents of the file, it cannot be used to write the file like you're trying to do (e.g., file.write(response.stream.read())). Instead, you'll need to use the url value in the response to get the contents of the file. I'm no Python expert, but it seems like you can use the urlopen function from urllib to do this. For example: data = urlopen('http://example.com').read().