Offsets for Kafka Direct Approach in Spark 1.3.1

Question

396 views Asked by joebuild At 11 June 2015 at 17:47

I am implementing the 'direct' approach for Kafka streaming in Spark 1.3.1 (https://spark.apache.org/docs/1.3.1/streaming-kafka-integration.html). As I understand it, 'auto.offset.reset' can be set to one of two values: "smallest" or "largest".

The behavior I am observing (let me know if this is expected) is that "largest" starts fresh and receives only new incoming data, while "smallest" starts from offset 0 and reads to the end, but then does not receive any new incoming data.

Clearly it would be preferable to start from the beginning and also receive new incoming data. I did see in the docs that the offsets consumed by each batch can be accessed, but I'm not sure how that would help here. Thanks.

Original Q&A

There is 1 answer

**joebuild** · Accepted Answer · 2015-06-11T18:03:46+00:00

It looks like I was mistaken: with 'auto.offset.reset' set to "smallest", the stream reads from the beginning and then does continue to receive new incoming data as it arrives.
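For reference, here is a minimal sketch of the kind of setup discussed above, following the direct approach described in the Spark 1.3.1 Kafka integration guide. The broker address, topic name, application name, local master, and batch interval are placeholder assumptions, not values from the original question. It starts from the smallest available offset and prints the offset range each batch consumed.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object DirectStreamFromSmallest {
  def main(args: Array[String]): Unit = {
    // Assumed app name, master, and batch interval; adjust for your cluster.
    val conf = new SparkConf()
      .setAppName("DirectStreamFromSmallest")
      .setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed broker list and topic name (placeholders).
    val kafkaParams = Map[String, String](
      "metadata.broker.list" -> "localhost:9092",
      // "smallest": begin at the earliest offset still available in each partition.
      // "largest" (the default): begin at the current end of each partition.
      "auto.offset.reset" -> "smallest"
    )
    val topics = Set("example-topic")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.foreachRDD { rdd =>
      // The cast must be applied to the RDD that comes straight from the direct stream.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      offsetRanges.foreach { o =>
        // fromOffset is inclusive, untilOffset is exclusive.
        println(s"${o.topic} partition ${o.partition}: offsets ${o.fromOffset} to ${o.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the stream keeps receiving new data, the printed `untilOffset` values should keep advancing in later batches after the initial backlog has been read, which is one way to check the behavior described in the answer.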
