Amazon S3 - connecting with Python Boto under specific permissions

I am trying to connect to Amazon S3 using Boto 2.38.0 and Python 3.4.3. The S3 account is owned by another company, and they grant only these permissions:

"Statement":
[
    {
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": "arn:aws:s3:::GA-Exports",
        "Condition":{
            "StringLike":
            {
                "s3:prefix": "Events_3112/*"
            }
        }
    },{
        "Effect": "Allow",
        "Action": 
        [
            "s3:GetObject",
            "s3:GetObjectAcl",
            "s3:GetBucketAcl"
        ],
        "Resource": "arn:aws:s3:::GA-Exports/Events_3112/*",
        "Condition": {}
    }
]

I can connect and retrieve a specific file if I give its name. But I need to list everything available to me in S3 (for example, so that a script can determine which files I have not yet downloaded).

from boto.s3.connection import S3Connection
from boto.s3.connection import OrdinaryCallingFormat
s3_connection = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
bucket = s3_connection.get_bucket(__bucket_name, validate=False)
key = bucket.get_key(file_name)

works, but

all_buckets = s3_connection.get_all_buckets()

raises an error:

S3ResponseError: S3ResponseError: 403 Forbidden
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>19D20ADCFFC899ED</RequestId><HostId>eI4CzQqAvOnjcXJNZyyk+drFHjO9+yj0EtP+vJ5f/D7D4Dh2HFL3UvCacy9nP/wT</HostId></Error>

With the S3 Browser software, I can right-click > "Export file list" and get what I need. But how can I do this in Python?

EDIT: I finally found the answer:

bucket_name = 'GA-Exports'
s3_connection = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
bucket = s3_connection.get_bucket(bucket_name, validate=False)
for key in bucket.list(prefix='Events_3112/DEV/'):
    print(key.name, key.size, key.last_modified)

Thanks for your help! :)

1 answer

Iosu S. (best answer)

You won't be allowed to list all buckets. The permissions say that you may list bucket contents only for "GA-Exports", and only under the "Events_3112/" prefix:

from boto.s3.connection import S3Connection
from boto.s3.connection import OrdinaryCallingFormat
# Use OrdinaryCallingFormat only when the bucket name contains a dot;
# using it otherwise can produce a "301 Moved Permanently" response.
if '.' in __bucket_name:
    conn = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
else:
    conn = S3Connection(access_key, secret_key)

bucket = conn.get_bucket(__bucket_name, validate=False)
# bucket.list() returns an iterable of Key objects under the given prefix
keys = bucket.list(prefix='Events_3112/')
# another option is bucket.get_all_keys(prefix='Events_3112/')
for key in keys:
    print(key.name)  # or whatever you want to do with each key name
    # Note: this is only the key name, not the file contents

See the complete bucket object reference at http://boto.readthedocs.org/en/latest/ref/s3.html#module-boto.s3.bucket
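
For the use case mentioned in the question (finding files that have not been downloaded yet), a minimal sketch could compare the listed key names against a local directory. The helper name `keys_not_yet_downloaded` and the assumption that downloads mirror the key paths under one local folder are hypothetical, not part of Boto:

```python
import os

def keys_not_yet_downloaded(key_names, download_dir):
    """Return the key names that have no matching file under download_dir.

    Assumes (hypothetically) that each key 'a/b/c.csv' is saved locally
    as <download_dir>/a/b/c.csv.
    """
    missing = []
    for name in key_names:
        if name.endswith('/'):
            # Skip "folder" placeholder keys, which carry no file data.
            continue
        local_path = os.path.join(download_dir, *name.split('/'))
        if not os.path.exists(local_path):
            missing.append(name)
    return missing

# With Boto, the names could come from the listing (not executed here):
# names = [key.name for key in bucket.list(prefix='Events_3112/')]
# todo = keys_not_yet_downloaded(names, '/path/to/downloads')
```

Each missing key could then be fetched with `bucket.get_key(name)`, as in the question.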

Edit: added a fix for the "301 Moved Permanently" error received when accessing S3 via OrdinaryCallingFormat. Also added @garnaat's comment on the prefix (thanks!)
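
To see why the prefix matters, here is a rough sketch of the `Events_3112/*` prefix condition from the policy in the question. It only approximates the wildcard match with Python's `fnmatch` (which, unlike IAM, also treats `[...]` specially); the function name `list_request_allowed` is made up for illustration, and this is not how AWS evaluates policies internally:

```python
from fnmatch import fnmatchcase

# Pattern taken from the policy's StringLike condition on the list prefix.
ALLOWED_PREFIX_PATTERN = 'Events_3112/*'

def list_request_allowed(prefix):
    """Approximate whether a ListBucket request with this prefix matches."""
    # A request without a prefix has nothing that can match the pattern.
    return fnmatchcase(prefix or '', ALLOWED_PREFIX_PATTERN)

print(list_request_allowed('Events_3112/DEV/'))  # listing a sub-folder
print(list_request_allowed('Events_3112/'))      # listing the granted folder
print(list_request_allowed(None))                # listing the whole bucket
```

Under this approximation, only the first two requests match, which is consistent with `bucket.list(prefix='Events_3112/...')` working while unrestricted listing is refused.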