How to track changes in Google Cloud public datasets?


Google Cloud has a number of public datasets available in Cloud Storage. I would like to track updates and additions to some of these datasets, i.e. to set up some kind of webhook that fires when new files are added to the public data buckets.

I read about Google Pub/Sub notifications, which make it possible to create Pub/Sub topics that receive change notifications for buckets.

However, I could not figure out whether such topics already exist for the public datasets so that I could subscribe to them, or how to create such a topic for the public dataset buckets.

Is there any way to track changes on the public datasets, possibly using Pub/Sub?


There is 1 answer

gso_gabriel (best answer)

You can try to listen for changes made to each individual bucket of the public datasets. For example, the Landsat dataset is stored in the bucket gs://gcp-public-data-landsat. As described in the official documentation, you can watch a bucket with the command gsutil notification watchbucket.

With this command and its parameters, you should be able to specify the bucket whose updates you want to track and where the notifications should be sent. For example, the following command watches the bucket gcp-public-data-landsat for changes and sends notifications to an application server running at example.com:

gsutil notification watchbucket https://example.com/notify gs://gcp-public-data-landsat

More information on the notification command can be found in the gsutil documentation.
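On the receiving side, the application at https://example.com/notify has to accept the POST requests that the watch sends. Below is a minimal sketch of such a receiver in Python, using only the standard library. It assumes the notifications carry the X-Goog-Resource-State and X-Goog-Channel-Token headers and a JSON body with the object's metadata, as I understand the Object Change Notification format. The token value, the port, and the function names are hypothetical, and the sketch also assumes HTTPS is terminated in front of the server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared secret. It would have to match a token passed via
# `gsutil notification watchbucket -t <token> ...` when creating the watch.
EXPECTED_TOKEN = "my-secret-token"


def describe_notification(headers, body):
    """Return a short summary of one notification, or None to ignore it."""
    # Ignore requests that do not carry the expected channel token.
    if headers.get("X-Goog-Channel-Token") != EXPECTED_TOKEN:
        return None
    state = headers.get("X-Goog-Resource-State")
    # A "sync" message is sent once when the watch is set up; it has no object.
    if state == "sync":
        return "watch established"
    obj = json.loads(body) if body else {}
    name = obj.get("name", "<unknown>")
    if state == "exists":
        return f"added or updated: {name}"
    if state == "not_exists":
        return f"deleted: {name}"
    return None


class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        summary = describe_notification(self.headers, body)
        if summary:
            print(summary)
        # Acknowledge the request so the sender treats it as delivered.
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("", port), NotifyHandler).serve_forever()
```

Calling serve() starts the receiver. Keeping the decision logic in describe_notification, separate from the HTTP handler, makes it easy to try out with hand-built headers before pointing a real watch at it.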

I would recommend giving this a try, since it seems to be the available option; nothing is pre-set or configured to watch these datasets.