Python read files from s3 bucket


I'd like to read the .csv and text.txt files as two inputs to my function without passing the file names explicitly, since I will have multiple CSV and text files and would like to loop over them. Below is the code I have used:

import boto3

s3 = boto3.resource('s3')

bucket = s3.Bucket('textractpipelinestack-documentsbucket9ec9deb9-1rm7fo8ds7m69')

for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
    print(key)

print(key) gives me the names of the files, but I'm not sure how to read them so that I can pass them as input. I've attached an image of the print(key) output; I would like to read "tables.csv" and "text.txt". Can anyone help?



There is 1 answer

John Greenfield On BEST ANSWER

The following reads the content of every object in the S3 bucket whose key ends in .csv or .txt. Replace the print statement with whatever logic you need to capture the data as input.

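# Keep only objects whose key ends in .csv or .txt
# (bucket is the boto3 Bucket resource created in the question)
file_list = [f for f in bucket.objects.all() if f.key.endswith(('.csv', '.txt'))]

for file in file_list:
    # Download the object and decode its bytes as UTF-8 text
    print(file.get()['Body'].read().decode(encoding="utf-8", errors="ignore"))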
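
If the goal is to pass each "tables.csv" together with its matching "text.txt" to a function, one possible approach is to group the keys by their prefix (the "folder" part of the key) and hand over each pair. The sketch below rests on assumptions not stated in the question: that each pair shares a key prefix, and that the function takes the two file contents as text. The names read_text, pair_inputs, process_bucket and handler are hypothetical, made up for illustration.

```python
import posixpath
from collections import defaultdict


def read_text(obj):
    """Download one S3 object and decode its body as UTF-8 text."""
    return obj.get()["Body"].read().decode("utf-8", errors="ignore")


def pair_inputs(objects, csv_name="tables.csv", txt_name="text.txt"):
    """Yield (prefix, csv_text, txt_text) for each prefix that holds both files.

    Assumption: a matching pair lives under the same key prefix,
    e.g. "doc1/tables.csv" and "doc1/text.txt".
    """
    by_prefix = defaultdict(dict)
    for obj in objects:
        prefix, name = posixpath.split(obj.key)
        if name in (csv_name, txt_name):
            by_prefix[prefix][name] = obj
    for prefix, found in sorted(by_prefix.items()):
        # Skip prefixes where one of the two files is missing
        if csv_name in found and txt_name in found:
            yield prefix, read_text(found[csv_name]), read_text(found[txt_name])


def process_bucket(bucket_name, handler):
    """Call handler(csv_text, txt_text) once per matching pair in the bucket."""
    import boto3  # imported here so pair_inputs can be tried without AWS access

    bucket = boto3.resource("s3").Bucket(bucket_name)
    for prefix, csv_text, txt_text in pair_inputs(bucket.objects.all()):
        handler(csv_text, txt_text)
```

Calling process_bucket with the bucket name and your own function as handler would then run that function once per pair. If the files are not grouped by prefix in your bucket, the grouping rule in pair_inputs would need to change.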