Is there any way to reduce the number of messages read per second from PubSubIO?


I have a cloud streaming pipeline that reads from PubSubIO, and its "PipelineOptions" are set with "WorkerMachineType = n1-standard-1". This machine has 3.75 GB of memory.

My problem is that if the subscription has a lot of messages, the pipeline reads them very quickly, and once it starts processing many elements it runs out of memory.

Is there any way to reduce the number of messages read per second? Or is the memory consumption related to the duration assigned to the window, so that I should reduce that duration instead?

Thanks in advance.


There is 1 answer

Tyler Akidau On

It sounds like you may be trying to process too much data with too few workers. We are looking at addressing this and related scenarios, but in the meantime you may want to try dialing down the amount of data you're ingesting, or increasing the number of workers available to the jobs.

You'll also get better performance with n1-standard-4 machines, which is why we make those the default for the streaming runner.
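
As a rough illustration of the worker and machine-type suggestions, here is a minimal sketch that assembles the command-line arguments one would typically pass to the pipeline. It assumes the standard Dataflow worker-pool options `--workerMachineType`, `--numWorkers` and `--maxNumWorkers`; the `WorkerPoolArgs` helper and the worker counts are hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any SDK): builds the argument list that
// would be handed to the pipeline, e.g. via PipelineOptionsFactory.fromArgs(...).
final class WorkerPoolArgs {
    private WorkerPoolArgs() {}

    static List<String> build(String machineType, int numWorkers, int maxNumWorkers) {
        // Reject combinations that make no sense before launching a job.
        if (numWorkers < 1 || maxNumWorkers < numWorkers) {
            throw new IllegalArgumentException("need 1 <= numWorkers <= maxNumWorkers");
        }
        List<String> args = new ArrayList<>();
        args.add("--workerMachineType=" + machineType); // bigger machine, more memory per worker
        args.add("--numWorkers=" + numWorkers);         // initial size of the worker pool
        args.add("--maxNumWorkers=" + maxNumWorkers);   // upper bound on the worker pool
        return args;
    }

    public static void main(String[] unused) {
        // Example values only (assumption); choose counts that fit your load and quota.
        System.out.println(String.join(" ", build("n1-standard-4", 3, 6)));
    }
}
```

The same values can also be set programmatically on the options object instead of through arguments; either way, the intent is to spread the backlog over more memory rather than to throttle the read itself.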