I have spark streaming process with batch interval of 30 seconds. If the batches pile up in the queue, I want to completely skip new batches created or dynamically change the batch interval to reduce the frequency with which the batch is created. I understand backpressure can be used to dynamically update the rate at which records are processed in each batch. What I am looking for is one of the below

  • Introduce a threshold for number of batches in the queue, if the no. of batches in the queue exceeds threshold then don’t add new batches
  • Dynamically update batch interval so that when the batch queue size is beyond certain limit, I can increase batch interval so that fewer batches get added to the queue.

Is there any way I can achieve my requirement?

Anonymous Asked question May 14, 2021