Near-real-time streaming data from 100s of customers to Google Pub/Sub to GCS


I am receiving near-real-time data from hundreds of customers. I need to store this data in Google Cloud Storage buckets created for each customer, i.e. /gcs/customer_id/yy/mm/day/hhhh/

My data is in Avro. I think I can use the "Pub/Sub to Avro Files on Cloud Storage" Dataflow template. However, I'm not sure whether Google Pub/Sub can accept data from multiple customers. I'd appreciate any help here, thanks!


There is 1 answer

guillaume blaquiere (BEST ANSWER)

The template is quite simple: it takes all the data from Pub/Sub and stores it in Avro files on GCS.

However, it's a good starting point: you can build on it to split the output per customer and to write to the file path that you want.

You can find the Java source of the template on GitHub.
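
To give an idea of the per-customer split, here is a minimal sketch of the path logic only. It is not part of the template. It assumes each message carries a customer ID (for example in a hypothetical `customer_id` Pub/Sub attribute), that the path should use the event time in UTC, and that `hhhh` in the question means a two-digit hour. The class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds the object prefix customer_id/yy/MM/dd/HH/
final class CustomerPaths {
    // Assumption: two-digit year, month, day and hour, all in UTC.
    private static final DateTimeFormatter HOURLY =
        DateTimeFormatter.ofPattern("yy/MM/dd/HH").withZone(ZoneOffset.UTC);

    static String prefixFor(String customerId, Instant eventTime) {
        // Reject IDs that would break the folder layout.
        if (customerId == null || customerId.isEmpty() || customerId.contains("/")) {
            throw new IllegalArgumentException("invalid customer id: " + customerId);
        }
        return customerId + "/" + HOURLY.format(eventTime) + "/";
    }

    public static void main(String[] args) {
        // prints acme/24/03/07/14/
        System.out.println(prefixFor("acme", Instant.parse("2024-03-07T14:05:00Z")));
    }
}
```

In a modified copy of the template, a function like this could probably be used to pick the destination of each record, for example with Beam's `FileIO.writeDynamic()`, so that each customer's data lands under its own prefix.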