Python script to count the number of records in a file in a Google Cloud Storage bucket


Can you please help by providing a Python script that counts the records in a file stored on GCS? I'm trying to connect from a Linux server to a GCS bucket and capture the record count/size of the file.


There is 1 answer

Zeenath S N

I am using the following script and it works for me. I hope it gives you an idea of how to do this.

import os
from flask import Flask
from google.cloud import storage
 
app = Flask(__name__)
 
 
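# Placeholders: replace these with your own object name, bucket name, and local path.
storage_client = storage.Client()
file_data = 'file_name'
bucket_name = 'bucket_name'
temp_file_name = 'temp_file_name'
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.get_blob(file_data)
# Download the object to a local file before counting its lines.
blob.download_to_filename(temp_file_name)
 
# Start the counter at 1 so it equals the number of lines; it stays 0 for an empty file.
count = 0
with open(temp_file_name, "r") as myfile:
    for count, _ in enumerate(myfile, start=1):
        pass
print('Total Lines', count)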
 
if __name__ == "__main__":
   app.run(debug=True,host='0.0.0.0',port=int(os.environ.get('PORT', 8080)))
 

I first download the file to my environment using download_to_filename() and then read it with open(). The enumerate() call in the for loop attaches a counter to each line, so the final counter value gives the number of lines in the file. You can read more about enumerate() in the Python documentation.
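As a rough sketch of the counting step on its own, the snippet below applies the same enumerate() approach to a local file and also reads its size, since the question asks for both. It assumes the file has already been downloaded; the helper name count_lines, the variable sample_path, and the sample contents are made up for illustration and stand in for the file written by download_to_filename().

```python
import os
import tempfile


def count_lines(path):
    """Return the number of lines in the text file at `path` (0 if it is empty)."""
    count = 0
    with open(path, "r") as handle:
        # enumerate(..., start=1) makes the last counter value equal the line count.
        for count, _ in enumerate(handle, start=1):
            pass
    return count


# Hypothetical stand-in for the file produced by download_to_filename().
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("id,name\n1,a\n2,b\n")
    sample_path = tmp.name

line_total = count_lines(sample_path)
size_bytes = os.path.getsize(sample_path)  # size of the local copy in bytes
print("Total Lines", line_total)
print("Size in bytes", size_bytes)

os.remove(sample_path)  # clean up the temporary file
```

Setting count to 0 before the loop means an empty file reports 0 lines instead of raising a NameError.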