Is it possible to mock an RDD without using SparkContext?
I want to unit test the following utility function:
def myUtilityFunction(data1: org.apache.spark.rdd.RDD[myClass1], data2: org.apache.spark.rdd.RDD[myClass2]): org.apache.spark.rdd.RDD[myClass1] = {...}
So I need to pass data1 and data2 to myUtilityFunction. How can I create data1 from a mock org.apache.spark.rdd.RDD[myClass1], instead of creating a real RDD from a SparkContext? Thank you!
I totally agree with @Holden on that!
Mocking RDDs is difficult; running your unit tests against a local SparkContext is preferred, as recommended in the Spark programming guide.
I know this may not technically be a unit test, but it is hopefully close enough.
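For illustration, here is a minimal sketch of that local-context approach. MyClass1, MyClass2, the body of myUtilityFunction, and the object name are hypothetical stand-ins, since the question does not show the real classes or logic; only the SparkConf/SparkContext calls are Spark's own API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical stand-ins for myClass1 and myClass2 from the question.
case class MyClass1(id: Int, label: String)
case class MyClass2(id: Int)

object UtilityFunctionLocalTest {
  // Placeholder body (an assumption): keep the data1 records whose id also appears in data2.
  def myUtilityFunction(data1: RDD[MyClass1], data2: RDD[MyClass2]): RDD[MyClass1] = {
    val wanted = data2.map(d => (d.id, ()))
    data1.map(d => (d.id, d)).join(wanted).map { case (_, (d, _)) => d }
  }

  def main(args: Array[String]): Unit = {
    // "local[2]" runs Spark inside this JVM with two worker threads; no cluster is needed.
    val conf = new SparkConf().setMaster("local[2]").setAppName("myUtilityFunction-test")
    val sc = new SparkContext(conf)
    try {
      // Build small real RDDs from in-memory collections.
      val data1 = sc.parallelize(Seq(MyClass1(1, "a"), MyClass1(2, "b"), MyClass1(3, "c")))
      val data2 = sc.parallelize(Seq(MyClass2(1), MyClass2(3)))

      // Collect to the driver and compare as a set, since RDD ordering is not guaranteed.
      val result = myUtilityFunction(data1, data2).collect().toSet
      assert(result == Set(MyClass1(1, "a"), MyClass1(3, "c")))
    } finally {
      sc.stop() // always release the context so later tests can create their own
    }
  }
}
```

In a real test suite you would probably create the context once in a before/after hook of your test framework rather than in main, but the idea is the same.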
But if you are really interested and still want to try mocking RDDs, I suggest reading the ImplicitSuite test code in the Spark source.
The only reason they are pseudo-mocking the RDD there is to test whether implicit works well with the compiler, so they don't actually need a real RDD. And it's not even a real mock: it just creates a null value of type RDD[T].
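To make that concrete, here is a rough sketch of the null-typed-RDD trick, written from memory of how ImplicitSuite does it rather than copied from it. The names NullRddSketch and compileOnly are made up for this example.

```scala
import org.apache.spark.rdd.RDD

object NullRddSketch {
  // Not a real mock: just a null reference whose static type is RDD[T].
  def mockRDD[T]: RDD[T] = null

  def main(args: Array[String]): Unit = {
    val pairs: RDD[(Int, String)] = mockRDD[(Int, String)]

    // This only has to compile: it type-checks if the implicit conversion
    // to PairRDDFunctions is found. It is never called, because invoking
    // an RDD operation on a null reference would fail at runtime.
    def compileOnly(): Unit = { pairs.groupByKey() }

    // The "mock" really is nothing more than null.
    assert(pairs == null)
  }
}
```

This is why the trick does not help with myUtilityFunction: as soon as the function calls any method on data1 or data2, the null reference fails, so it is only useful for compile-time checks.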