Connecting the first two nodes with an edge from two RDDs in GraphX

I am using GraphX for the first time and I want to build a graph incrementally. I need to connect the first two nodes with an edge, given two RDDs that each contain a single element:

firstRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]
secondRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]  

I want to create an edge from the VertexId in firstRDD to the VertexId in secondRDD. I appreciate your help.

Answer by David Griffin (best answer):

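Basically, you use map with case patterns to pick out the VertexIds, then use RDD.zip to stitch them together, then another map to create the edge. The result is an RDD[Edge[Int]], which you can pass to Graph.fromEdges. Note that RDD.zip requires both RDDs to have the same number of partitions and the same number of elements in each partition: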

import org.apache.spark.graphx.Edge

// Extract the VertexId from each RDD, pair them up, and build the edge.
firstRDD.map {
  case (_, ((vertex1, _), _)) => vertex1
}.zip(
  secondRDD.map {
    case (_, ((vertex2, _), _)) => vertex2
  }
).map { case (vertex1, vertex2) => Edge(vertex1, vertex2, 0) }
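
For completeness, here is a minimal end-to-end sketch of the same idea that you could run locally. The sample data, the object name ConnectFirstTwoNodes, the Row type alias, and the helpers vertexIds and connect are all made up for illustration and are not from the question. It also swaps zip for cartesian, which is an assumption on my part: with exactly one element in each RDD it produces the same single pair, and it does not depend on how the two RDDs happen to be partitioned.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object ConnectFirstTwoNodes {
  // Element type copied from the question; the alias name is hypothetical.
  type Row = ((Int, Array[Int]), ((VertexId, Array[Int]), Int))

  // Pull the VertexId out of the nested tuple and ignore everything else.
  def vertexIds(rdd: RDD[Row]): RDD[VertexId] =
    rdd.map { case (_, ((vertexId, _), _)) => vertexId }

  // Pair every id in `first` with every id in `second`.
  // With one element per RDD this yields exactly one edge.
  def connect(first: RDD[Row], second: RDD[Row]): RDD[Edge[Int]] =
    vertexIds(first)
      .cartesian(vertexIds(second))
      .map { case (src, dst) => Edge(src, dst, 0) }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("connect-first-two-nodes")
      .setMaster("local[*]") // local mode, for experimentation only
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data: one element per RDD, as in the question.
      val firstRDD: RDD[Row] =
        sc.parallelize(Seq(((0, Array(1, 2)), ((1L, Array(3)), 0))))
      val secondRDD: RDD[Row] =
        sc.parallelize(Seq(((1, Array(4, 5)), ((2L, Array(6)), 0))))

      val edges = connect(firstRDD, secondRDD)

      // Graph.fromEdges creates the vertices referenced by the edges
      // and gives each of them the default attribute (0 here).
      val graph: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0)
      graph.edges.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If you later add more elements to either RDD, cartesian will connect every vertex in the first RDD to every vertex in the second, which is probably not what you want; at that point zip (with matching partitioning) or a join on a shared key is the better fit.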