I am completely new to Apache Spark and I am trying to compute the Cartesian product of two RDDs. As an example, I have A and B like:
A = {(a1,v1),(a2,v2),...}
B = {(b1,s1),(b2,s2),...}
I need a new RDD like:
C = {((a1,v1),(b1,s1)), ((a1,v1),(b2,s2)), ...}
Any idea how I can do this? As simple as possible :)
Thanks in advance
PS: I finally did it like this, as suggested by @Amit Kumar:
cartesianProduct = A.cartesian(B)
That's not the dot product, that's the Cartesian product. Use the cartesian method.
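To see the shape of the result without a running Spark context, here is a minimal sketch in plain Python that uses itertools.product as a stand-in for A.cartesian(B). The sample values are made up for illustration, and plain lists replace the RDDs, so this only mirrors the pairing of elements, not Spark's distributed execution; the order of pairs in a real RDD may differ.

```python
from itertools import product

# Hypothetical sample data standing in for the RDDs A and B from the question.
A = [("a1", "v1"), ("a2", "v2")]
B = [("b1", "s1"), ("b2", "s2")]

# Pair every element of A with every element of B.
# With real RDDs the corresponding call is: C = A.cartesian(B)
C = list(product(A, B))

for pair in C:
    print(pair)
```

Each element of C is a tuple of two tuples, such as (('a1', 'v1'), ('b1', 's1')), and the result has len(A) * len(B) elements, so it grows quickly for large inputs.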