How can I export Pub/Sub messages using a Protobuf schema to a GCS bucket?


I'm currently publishing messages via a Pub/Sub topic that is using a Protobuf schema. This was working fine as my consumer was able to read and decode these messages using the aforementioned schema.

I now want to create an export subscription that instead reads these messages and writes them out to files in a GCS bucket. I want these files to contain those same protobuf messages that I can then read in via a Dataflow job to decode and process them.

I've been able to successfully set up the export subscription to write out files to the GCS bucket using two approaches:

  • Explicitly creating a Write to Cloud Storage subscription via Cloud Console (file format as text)
  • Creating a Dataflow job via the template Pub/Sub to Text Files on Cloud Storage

However, in both cases the resulting files are unreadable. When I download them and try to decode them manually with protoc (using the same schema that generated the records), I get the error: Failed to parse input.

Naturally, I think the pain point here is writing out binary Protobuf messages as text files. Is there a way to make this setup work while the publisher keeps writing Protobuf messages?
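To illustrate the failure mode I suspect, here is a minimal sketch in plain Python (no Pub/Sub or protobuf libraries). It assumes the text output separates records with a newline byte, which I have not confirmed for either export path. The helper names are hypothetical. The two sample payloads are hand-encoded Protobuf messages with a single varint field 1 (tag byte 0x08); the first has value 10, so its payload contains the byte 0x0a, which is also the newline character.

```python
import struct

# Hand-encoded Protobuf payloads: field 1 (varint), values 10 and 1.
# The value 10 serializes to the byte 0x0a, i.e. a newline.
records = [b"\x08\x0a", b"\x08\x01"]


def write_newline_delimited(recs):
    """Assumed text-file layout: each record followed by a newline."""
    return b"\n".join(recs) + b"\n"


def read_newline_delimited(blob):
    """Split on newlines, dropping the empty piece after the last one."""
    return blob.split(b"\n")[:-1]


def write_length_prefixed(recs):
    """Alternative framing: 4-byte big-endian length before each record."""
    return b"".join(struct.pack(">I", len(r)) + r for r in recs)


def read_length_prefixed(blob):
    """Read records back using the length prefix instead of a delimiter."""
    out, offset = [], 0
    while offset < len(blob):
        (size,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        out.append(blob[offset:offset + size])
        offset += size
    return out


# Newline framing splits the first record in the middle.
broken = read_newline_delimited(write_newline_delimited(records))
# Length-prefixed framing returns the original records unchanged.
intact = read_length_prefixed(write_length_prefixed(records))

print(broken == records)
print(intact == records)
```

If this is what is happening, any output format that frames records by length, or stores the payload in a dedicated bytes field, would avoid the problem. Whether either export option offers such a format is what I am trying to find out.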
