I am experimenting with Giraph. To run the algorithms in Giraph, I need the graph data to be in vertex input format, but almost all large graph datasets available online are in edge list format. I wrote a Java program to convert an edge list into VertexInputFormat. It works for smaller graphs of up to about 800k edges. However, for the graph I actually need, every run fails with a heap space exceeded error. I tried increasing the heap size to the maximum, but the error persisted.
The file on which I am running is about 15GB in size.
I don't know much about how the algorithms (PageRank, SingleSourceShortestPath, etc.) are written in Giraph, but I do know that they all take a graph in VertexInputFormat as input.
The help I am looking for is:
- Optimized code to convert EdgeInputFormat to VertexInputFormat (or)
- Any online tool that can help with this conversion (or)
- A PageRank implementation that takes EdgeInputFormat as input.
Sorry, I don't see why you want to use only VertexInputFormat. Giraph also provides an EdgeInputFormat API; why can't you use that?
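
If you do need to produce the vertex format yourself, the heap error likely comes from holding the whole graph in memory before writing it out. Below is a minimal sketch of a streaming alternative. It assumes the edge list has first been grouped by source vertex (for example with an external sort such as GNU `sort -n -k1,1`, which spills to disk instead of the heap), and it assumes the target is the JSON-style line `[id,value,[[target,weight],...]]` that `JsonLongDoubleFloatDoubleVertexInputFormat` reads. The class name `EdgeListToAdjacency`, the vertex value 0, and the edge weight 1 are hypothetical choices for illustration, not anything taken from your code.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: converts "src dst" lines, already grouped by src,
// into one "[src,0,[[dst,1],...]]" line per source vertex.
final class EdgeListToAdjacency {

    static void convert(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        StringBuilder edges = new StringBuilder(); // edges of the current vertex only
        long current = 0;
        boolean haveVertex = false;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comment headers
            }
            String[] parts = line.split("\\s+");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            long src = Long.parseLong(parts[0]);
            long dst = Long.parseLong(parts[1]);
            if (haveVertex && src != current) {
                writeVertex(writer, current, edges); // source changed: emit previous vertex
                edges.setLength(0);
            }
            current = src;
            haveVertex = true;
            if (edges.length() > 0) {
                edges.append(',');
            }
            edges.append('[').append(dst).append(",1]"); // assumed edge weight 1
        }
        if (haveVertex) {
            writeVertex(writer, current, edges); // emit the last vertex
        }
        writer.flush();
    }

    private static void writeVertex(Writer w, long id, StringBuilder edges) throws IOException {
        // assumed vertex value 0
        w.write("[" + id + ",0,[" + edges + "]]\n");
    }

    public static void main(String[] args) throws IOException {
        // args[0]: sorted edge list, args[1]: output file
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]));
             Writer out = Files.newBufferedWriter(Paths.get(args[1]))) {
            convert(in, out);
        }
    }
}
```

Only one vertex's adjacency list is in memory at a time, so heap use depends on the largest out-degree rather than on the 15GB file size. Note that a vertex that appears only as a target gets no line of its own in this sketch, so check whether that matters for the algorithm you run.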