Google Cloud hosts a number of public datasets in its Cloud Storage service. I would like to track updates and additions to some of these public datasets, i.e. to set up some kind of webhook that fires when new files are added to the public data buckets.
I read about Google Pub/Sub notifications and the possibility of creating Pub/Sub topics that push change notifications for buckets.
However, I could not figure out whether such topics already exist for the public datasets so that I could subscribe to them, or how to create such a topic for the public dataset buckets.
Is there any way to track changes to the public datasets, possibly using Pub/Sub?
You can try to listen for changes made to each individual bucket of the public datasets. For example, the Landsat data dataset is stored in the bucket gs://gcp-public-data-landsat. As described in the official documentation, you can watch a bucket with the gsutil notification watchbucket command.
With this command and its parameters, you specify the bucket whose updates you want to track and the URL the notifications should be sent to. For example, the following command watches the bucket gcp-public-data-landsat for changes and sends notifications to an application server running at example.com:
gsutil notification watchbucket https://example.com/notify gs://gcp-public-data-landsat
More information on the notification command can be found in the gsutil documentation.
I would recommend giving this a try, as it seems to be the only available option: nothing is pre-set or configured to watch these datasets.
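If watching the bucket works, the application server receives an HTTP POST for each change. Below is a minimal sketch, in Python, of how such a server might interpret one of those requests. It assumes, based on my reading of the object change notification documentation, that the request carries an X-Goog-Resource-State header with the values sync, exists, or not_exists, and that the body is the object's JSON metadata with name and bucket fields. The function name describe_notification is hypothetical, and the web framework that would call it is left out.

```python
import json


def describe_notification(headers, body):
    """Summarize a single change-notification POST as a short string.

    headers: dict of HTTP request headers as received by the server.
    body:    raw request body (bytes or str); assumed to be JSON object
             metadata, and may be empty for the initial sync message.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    state = lowered.get("x-goog-resource-state", "")

    # Assumption: a "sync" message is sent once when the channel is created
    # and does not describe any object.
    if state == "sync":
        return "sync: channel established"

    metadata = json.loads(body) if body else {}
    name = metadata.get("name", "<unknown>")
    bucket = metadata.get("bucket", "<unknown>")

    # Assumption: "exists" covers new and updated objects,
    # "not_exists" covers deletions.
    if state == "exists":
        return f"added or updated: gs://{bucket}/{name}"
    if state == "not_exists":
        return f"deleted: gs://{bucket}/{name}"
    return f"unrecognized state: {state!r}"
```

A handler for the /notify endpoint could call this function with the incoming headers and body and then log the result or trigger whatever downstream processing is needed for new files.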