I need to read a file from a GCS bucket. I know I'll have to use GCS API/Client Libraries but I cannot find any example related to it.
I have been referring to the GCS Client Libraries page in the GCS documentation, but couldn't make much headway with it. If anybody can provide an example, that would really help. Thanks.
OK. If you want to simply read files from GCS, not as a PCollection but as regular files, and if you are having trouble with the GCS Java client libraries, you can also use the Apache Beam FileSystems API:
First, you need to make sure that your pom.xml has a Maven dependency on beam-sdks-java-extensions-google-cloud-platform-core, which contains the implementation of the gs:// filesystem.
Then set up the FileSystems API (it is set up by default in all pipelines, but if you're using it outside a pipeline, you need to do it manually).
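A minimal sketch of that manual setup, assuming default pipeline options are enough for your case. The class name FileSystemsSetup is only illustrative. Registering the options lets FileSystems discover the filesystem implementations on the classpath, including gs:// when the GCP core extension is present:

```java
import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class FileSystemsSetup {
    /** Registers the filesystems found on the classpath; call once before using FileSystems. */
    public static void init() {
        // Assumption: default options suffice; build them from args if you need custom settings.
        PipelineOptions options = PipelineOptionsFactory.create();
        FileSystems.setDefaultPipelineOptions(options);
    }

    public static void main(String[] args) {
        init();
    }
}
```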
Then you can use it:
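Something along these lines should work. It is a sketch rather than a definitive implementation: the class name GcsFileReader, the helper readFile, and the bucket path are made up for illustration, and it assumes the file is UTF-8 text. FileSystems.matchNewResource turns the path into a resource ID, and FileSystems.open returns a ReadableByteChannel that you can wrap in a regular InputStream:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.io.fs.ResourceId;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class GcsFileReader {
    /**
     * Reads the file at the given path (for example gs://bucket/object)
     * and returns its lines joined with "\n". Assumes UTF-8 text.
     */
    public static String readFile(String path) throws IOException {
        // The second argument says the path names a file, not a directory.
        ResourceId resource = FileSystems.matchNewResource(path, false);
        try (ReadableByteChannel channel = FileSystems.open(resource);
             BufferedReader reader = new BufferedReader(
                 new InputStreamReader(Channels.newInputStream(channel), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Needed outside a pipeline so that the gs:// filesystem gets registered.
        FileSystems.setDefaultPipelineOptions(PipelineOptionsFactory.create());
        // Hypothetical bucket and object name; replace with your own.
        System.out.println(readFile("gs://my-bucket/path/to/file.txt"));
    }
}
```

The same helper also accepts a local path, since the core SDK ships a local filesystem implementation, which makes it easy to try out without touching a bucket.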