I understand that there are properties like CRUNCH_BYTES_PER_REDUCE_TASK or mapred.reduce.tasks to set number of reducers.
Can anyone suggest on configuring / overriding the default reducers for a particular Dofn which is taking more time to execute.
Reducers can be configured for particular DoFn by using the
ParallelDoOptions
and passing this as a 4th argument inparallelDo
like this:and pass this in
parallelDo
as 4th parameter.