Spark + Kinesis integration example does not look like a Kinesis Client Library (KCL) application


I am going through the example code at https://github.com/apache/spark/blob/master/extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java

It shows how Kinesis stream data can be fed into Spark, which then processes it further.

I am trying to understand how this example is a KCL application. According to http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-implementation-app-java.html, you must complete the following tasks when implementing an Amazon Kinesis application in Java:

Tasks

- Implement the IRecordProcessor Methods
- Implement a Class Factory for the IRecordProcessor Interface
- Modify the Configuration Properties

However, the Spark example code https://github.com/apache/spark/blob/master/extras/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java has no reference to IRecordProcessor, Worker, etc.

Note: the deploying section of https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html says: "A single Kinesis input DStream can read from multiple shards of a Kinesis stream by creating multiple KinesisRecordProcessor threads." But I see no implementation of KinesisRecordProcessor in the example. Is it missing, or am I missing something obvious?

Could somebody please explain how this is a KCL application?


There is 1 answer

ChristopherB (best answer)

The Kinesis streaming implementation in Spark takes care of those KCL interactions and abstracts them away from the application. See https://github.com/apache/spark/tree/master/extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis
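
To make the "abstracts away" part more concrete, here is a rough, self-contained sketch of the pattern. It does not use the real KCL or Spark classes: RecordProcessor, RecordProcessorFactory, Worker, Receiver and createStream below are all hypothetical, simplified stand-ins I made up for illustration. As far as I can tell, the linked package plays the role of Receiver here (its KinesisReceiver builds the record processor factory and the KCL worker internally), so the application only ever calls a single stream-creation method and never touches IRecordProcessor itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KclWrappingSketch {

    // Hypothetical, simplified stand-in for the KCL record processor interface.
    interface RecordProcessor {
        void initialize(String shardId);
        void processRecords(List<String> records);
    }

    // Hypothetical stand-in for the record processor factory.
    interface RecordProcessorFactory {
        RecordProcessor createProcessor();
    }

    // Hypothetical stand-in for the KCL worker: it asks the factory for one
    // processor per shard and feeds that shard's records to it.
    static class Worker {
        private final RecordProcessorFactory factory;

        Worker(RecordProcessorFactory factory) {
            this.factory = factory;
        }

        void run(Map<String, List<String>> shards) {
            for (Map.Entry<String, List<String>> shard : shards.entrySet()) {
                RecordProcessor processor = factory.createProcessor();
                processor.initialize(shard.getKey());
                processor.processRecords(shard.getValue());
            }
        }
    }

    // Hypothetical stand-in for the integration layer: it implements the
    // processor and factory itself and hands every record to store().
    static class Receiver {
        private final List<String> stored = new ArrayList<>();

        void store(String record) {
            stored.add(record);
        }

        List<String> start(Map<String, List<String>> shards) {
            RecordProcessorFactory factory = () -> new RecordProcessor() {
                @Override
                public void initialize(String shardId) {
                    // Nothing to set up in this sketch.
                }

                @Override
                public void processRecords(List<String> records) {
                    for (String record : records) {
                        store(record); // forward to the enclosing Receiver
                    }
                }
            };
            new Worker(factory).run(shards);
            return stored;
        }
    }

    // The only thing application code calls; no processor or worker in sight.
    public static List<String> createStream(Map<String, List<String>> shards) {
        return new Receiver().start(shards);
    }

    public static void main(String[] args) {
        Map<String, List<String>> shards = new LinkedHashMap<>();
        shards.put("shard-0", Arrays.asList("hello world"));
        shards.put("shard-1", Arrays.asList("hello kinesis"));
        System.out.println(createStream(shards));
    }
}
```

The word count example is in the position of main() here: it only sees the stream it gets back, which is why you find no IRecordProcessor or Worker in it.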