I'm trying to make a network graph using NetworkX from a set of nodes and their attributes. Each node is unique, but it can share attributes with other nodes. A shared attribute should act as an edge between all the nodes that have it.
An example of the input (node and attributes):
Name1 2-s2.0-84905590088, 2-s2.0-84901477890
Name2 2-s2.0-84941169876
Name3 2-s2.0-84958012773
Name4 2-s2.0-84960796474
Name5 2-s2.0-84945302996, 2-s2.0-84953281823, 2-s2.0-84944268402, 2-s2.0-84949478621, 2-s2.0-84947281259, 2-s2.0-84947759580, 2-s2.0-84945265895, 2-s2.0-84945247800, 2-s2.0-84946541351, 2-s2.0-84946051072, 2-s2.0-84942573284, 2-s2.0-84942280140, 2-s2.0-84937715425, 2-s2.0-84943751990, 2-s2.0-84957729558, 2-s2.0-84938844501, 2-s2.0-84934761065
Name6 2-s2.0-84908333808
Name7 2-s2.0-84925879816
Name8 2-s2.0-84940447040, 2-s2.0-84949534001
Name9 2-s2.0-84899915556, 2-s2.0-84922392381, 2-s2.0-84905079505, 2-s2.0-84940931972, 2-s2.0-84893682063, 2-s2.0-84954285577, 2-s2.0-84934934228, 2-s2.0-84926624187
Name10 2-s2.0-84907065810
So `Name5` would have a lot of edges connecting it to the other names with the same identifier.
I'm not sure if this is the right idea behind NetworkX, or if this kind of input can even be used to build a graph. If it can't, how should I format the input to make this graph? I haven't been able to find any documentation or videos on using NetworkX this way.
What you ask is possible. I stored your data in a CSV file; note that I added a `,` after the node names and removed all whitespace.
One observation: you say that `Name5` would have a lot of edges, but its attributes are unique. Moreover, when I run my code with your data, it turns out that all of the attributes are unique, so there are no edges in the graph.
I tweaked your data so that I use only the first 12 characters of each attribute (I do that with the line `new_attributes = [x[:12] for x in new_attributes]`). That way I get some matching attributes.
Now the code:
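A minimal sketch of this approach, assuming the CSV layout described above (node name first, then its attributes, comma-separated, no whitespace). The function name `build_graph`, the `prefix_len` parameter, the `shared` edge attribute, and the inline `CSV_TEXT` sample (a few rows taken from the question) are illustrative choices rather than anything the question prescribes.

```python
import csv
import io

import networkx as nx

# A few rows from the question in the CSV layout described above.
# Reading from a string keeps the sketch self-contained; an opened
# CSV file could be passed to build_graph() in the same way.
CSV_TEXT = """\
Name1,2-s2.0-84905590088,2-s2.0-84901477890
Name2,2-s2.0-84941169876
Name8,2-s2.0-84940447040,2-s2.0-84949534001
Name9,2-s2.0-84899915556,2-s2.0-84922392381,2-s2.0-84905079505
"""


def build_graph(lines, prefix_len=12):
    """Build a graph whose edges link nodes that share an attribute."""
    graph = nx.Graph()
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        name, new_attributes = row[0], row[1:]
        # Keep only the first prefix_len characters so that some
        # attributes match (the full identifiers are all unique).
        new_attributes = [x[:prefix_len] for x in new_attributes]
        graph.add_node(name, my_attributes=new_attributes)

        # Compare the new node with every node already in the graph.
        for other, data in list(graph.nodes(data=True)):
            if other == name:
                continue
            shared = set(new_attributes) & set(data["my_attributes"])
            if shared:
                graph.add_edge(name, other, shared=sorted(shared))
    return graph


graph = build_graph(io.StringIO(CSV_TEXT))
# Print each edge together with the attributes its two nodes share.
for u, v, data in graph.edges(data=True):
    print(u, v, data["shared"])
```

With these sample rows, only `Name1` and `Name9` share a 12-character prefix, so the graph has a single edge. Passing `prefix_len=18` compares the full identifiers and yields no edges.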
For each CSV row I add a node (with its attributes) to the graph and, based on the nodes already in the graph (and their attributes), I add the edges. Note that the node attributes are stored in a list and can be accessed with the `my_attributes` key. At the end I also print the edges together with the matching attributes of the two nodes in each edge (I use `set` and `&` to get the intersection of two lists of attributes).
Output for the tweaked data:
One final note: if you need multiple edges between two nodes, use a `MultiGraph`.
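As a rough illustration of that last point, the sketch below adds one parallel edge per shared attribute and uses the attribute itself as the edge key. The node names and attribute values here are made up for the example, and keying edges by attribute is just one possible convention.

```python
import networkx as nx

# Hypothetical nodes and attributes, used only for illustration.
attributes = {
    "A": ["x", "y"],
    "B": ["x", "y", "z"],
    "C": ["z"],
}

multi = nx.MultiGraph()
for name, attrs in attributes.items():
    multi.add_node(name, my_attributes=attrs)

# Visit each unordered pair of nodes once and add one edge per
# shared attribute, keyed by that attribute.
names = list(attributes)
for i, u in enumerate(names):
    for v in names[i + 1:]:
        for shared in sorted(set(attributes[u]) & set(attributes[v])):
            multi.add_edge(u, v, key=shared)

# Number of parallel edges between A and B.
print(multi.number_of_edges("A", "B"))
```

Here `A` and `B` end up connected by two parallel edges (one for `x`, one for `y`), which a plain `Graph` would collapse into a single edge.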