Django: copy a local PDF to an S3 bucket


I am having trouble saving a local file to my S3 bucket.

I have a cron job in my Django project that generates a PDF file after a certain time, and I want to save that file to the S3 bucket.

Currently, the S3 integration in my Django project works well: uploaded files are saved to the S3 bucket, and everything else works.

But I cannot figure out how to copy a local file and save it to the S3 bucket.

Currently I am saving the file on my local machine like this:

shutil.copyfile('/var/www/local.pdf', 'media/newfileins3bucket.pdf')

But that is not what I want; I want to save the file directly to the S3 bucket.

Can anyone help me with this?

I am using django-storages, but I could not find a way to save a local PDF directly to the S3 bucket: https://django-storages.readthedocs.io/en/latest/backends/amazon-S3.html


There are 2 answers

robin

There are several ways to do this, but I think one of the following should work for you.

Note: I'm assuming (as you mentioned) you have Django Storages with the S3 backend set up in your settings as the default storage.

Upload using a FileField on a model

If you have a model set up that saves a reference to the generated report, you can do something like this:

from django.db import models
from django.core.files import File

class Report(models.Model):
    # this field is stored in the S3 bucket if you use the correct Django Storages backend
    report_file = models.FileField()

# in your cron script, once the report has been generated at '/var/www/local.pdf'
with open('/var/www/local.pdf', 'rb') as local_file:
    report = Report()
    # this uploads the contents of the file to S3 and also saves the model to the database
    report.report_file.save('media/newfileins3bucket.pdf', File(local_file))

Note that you have to wrap your local file in a Django File object.

Calling save() on the file field automatically saves the model in the database as well, unless you add save=False to the call. For more info, see the documentation on FieldFile.save().
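For example, if you want to set other fields on the model before hitting the database, a minimal sketch (reusing the Report model above) could look like this:

from django.core.files import File

report = Report()
with open('/var/www/local.pdf', 'rb') as local_file:
    # save=False still uploads the file to storage, but skips the database write
    report.report_file.save('media/newfileins3bucket.pdf', File(local_file), save=False)
# ...set any other model fields here...
report.save()  # single database write at the end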

Direct upload without a model

If you just want to upload the file to S3 without saving it in a model, you can do something like this:

from django.core.files.storage import default_storage

# default_storage will be the S3 storage if you use Django Storages with the S3 backend in your settings
with open('/var/www/local.pdf', 'rb') as local_file:
    with default_storage.open('media/newfileins3bucket.pdf', 'wb') as target:
        target.write(local_file.read())

Disclaimer: I have used something similar but have not tested the exact code above. It should point you in the right direction though.
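As an alternative sketch under the same assumption (django-storages S3 backend configured as the default storage), default_storage.save() accepts a file-like object directly, which avoids the manual read/write loop:

from django.core.files import File
from django.core.files.storage import default_storage

with open('/var/www/local.pdf', 'rb') as local_file:
    # returns the name actually used by the storage (it may differ if the key already exists)
    saved_name = default_storage.save('media/newfileins3bucket.pdf', File(local_file))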

owais mushtaq
import logging
import uuid
from copy import deepcopy

import boto3
import botocore
from django.conf import settings
from django.http import HttpResponse
from rest_framework import status as api_status  # the HTTP_* codes below assume DRF status codes

logger = logging.getLogger(__name__)

# aws_access_key_id and aws_secret_access_key are assumed to be defined elsewhere (e.g. in settings)
s3 = boto3.client('s3',
                  region_name="",  # put your region here
                  aws_access_key_id=aws_access_key_id,
                  aws_secret_access_key=aws_secret_access_key)


files = request.FILES.getlist('file')  # get all files
for file in files:
    deep_file = deepcopy(file)
    status, aws_file_path = upload_to_aws(deep_file)
    if status == api_status.HTTP_200_OK:
        reference_id = [aws_file_path]
        logger.debug("AWS_STORAGE file path {}".format(reference_id))
        message = "Uploaded Successfully"
    else:
        message = "COULD NOT CONNECT AWS"
        status_api = status
        return HttpResponse({}, status=status_api)


def upload_to_aws(file):
    global s3
    try:
        is_bucket = check_is_bucket_present()
    except botocore.exceptions.NoCredentialsError:
        logger.debug("Unable to locate credentials for AWS")
        return api_status.HTTP_500_INTERNAL_SERVER_ERROR, {}
    if not is_bucket:
        try:
            bucket = s3.create_bucket(Bucket=settings.AWS_BUCKET)
        except botocore.exceptions.ClientError as e:
            logger.debug("AWS_ Error while Creating Bucket {} : ".format(str(e)))
            return api_status.HTTP_500_INTERNAL_SERVER_ERROR, {}
    # random folder name for this upload
    folder_name = uuid.uuid4().hex[:20]
    try:
        # errors from the upload surface as exceptions, hence the try/except
        file_path = str(folder_name + "/resume/" + str(file.name))
        # todo need to look into uploading via client
        # GB = 1024 ** 3
        # config = TransferConfig(multipart_threshold=5 * GB)
        # s3.upload_file('result1.csv', bucket_name, 'folder_name/result1.csv', Config=config)
        # was working with path but not with inmemoryobject
        s3 = boto3.resource('s3',
                            region_name="us-------",  # put your region here
                            aws_access_key_id=aws_access_key_id,
                            aws_secret_access_key=aws_secret_access_key)
        s3.Bucket(settings.AWS_BUCKET).put_object(Key=file_path, Body=file)
        file_path_bucket = settings.AWS_BUCKET + "/" + file_path

        return api_status.HTTP_200_OK, file_path_bucket
    except botocore.exceptions.ClientError as e:
        logger.debug("AWS_STORAGE Error {}".format(str(e)))
        return api_status.HTTP_500_INTERNAL_SERVER_ERROR, {}
    except Exception as e:
        logger.debug("AWS_STORAGE Error {}".format(str(e)))
        return api_status.HTTP_500_INTERNAL_SERVER_ERROR, {}
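For the cron-job case from the question (a PDF on disk rather than an uploaded request.FILES object), a minimal sketch along the same lines, assuming settings.AWS_BUCKET and the credentials above, could be:

import boto3

s3 = boto3.resource('s3',
                    region_name="",  # put your region here
                    aws_access_key_id=aws_access_key_id,
                    aws_secret_access_key=aws_secret_access_key)

with open('/var/www/local.pdf', 'rb') as local_pdf:
    # the key mirrors the 'media/' prefix used in the question
    s3.Bucket(settings.AWS_BUCKET).put_object(Key='media/newfileins3bucket.pdf',
                                              Body=local_pdf)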