We often use GraphBLAS for graph processing so we need to use the incidence matrix. I haven't been able to find a way to export this from Grakn to a csv or any file. Is this possible?
Method to export Incidence Matrix from Grakn?
117 views Asked by Brian Dolan AtThere are 2 answers
So the incidence matrix is equivalent to the nodes/edges array used in force graph visualisation. In this case it is pretty straight forward.
My approach would be slightly different than the above as all i need to do is pull all of the things in the db (entities, relations, attributes), with
match $ting isa thing;
Now when i get my transaction back, for each $ting I want to pull all of the available properties using both local and remote methods if I am building a force graph viz, but for your incidence matrix, I really only care about pulling 3 bits of data:
- The iid of the thing
- The attributes the thing may own.
- The roles the thing owns if it is a relation
Essentially one tests each returned object to find out the type (e.g. entity, attribute, relation), and then uses some of the local and remote methods to get the data one wants. In Python, the code for pulling the data for relations looks like
# pull relation data
elif thing.is_relation():
rel = {}
rel['type'] = 'relation'
rel['symbol'] = key
rel['G_id'] = thing.get_iid()
rel['G_name'] = thing.get_type().get_label().name()
att_obj = thing.as_remote(r_tx).get_has()
att = []
for a in att_obj:
att.append(a.get_iid())
rel['has'] = att
links = thing.as_remote(r_tx).get_players_by_role_type()
logger.debug(f' links are -> {links}')
edges = {}
for edge_key, edge_thing in links.items():
logger.debug(f' edge key is -> {edge_key}')
logger.debug(f' edge_thing is -> {list(edge_thing)}')
edges[edge_key.get_label().name()] = [e.get_iid() for e in list(edge_thing)]
rel['edges'] = edges
res.append(rel)
layer.append(rel)
logger.debug(f'rel -> {rel}')
This then gives us a node array, which we can easily process to build an edges array (i.e. the links joining an object and the attributes it owns, or the links joining a relation to its role players). Thus, exporting your incidence matrix is pretty straightforward
There isn't a built-in way to dump data to CSV in Grakn right now. However, we do highly encourage our community to contribute open source tooling for these kinds of tasks! Feel free to chat to use about it on our discord.
As to how it can be done, conceptually it's pretty easy:
Query to get stream all hyper-relations out:
and then for each relation, we can pipeline another query (possibly in new transaction if you wish to keep memory usage lower):
which will get you everything in this particular hyper relation
$r
playing a role.If you also wish to extract attributes that are attached to the hyper relation, you can use the following
In effect we can use these steps to build up each column in the
A
incidence matrix.There are a couple if important caveats I should bring up:
==> It would be interesting to hear/discuss how one could encode types information for use in GraphBLAS
E
will also be present in the setV
. In practice only a few relations will play a role in other relations, so only a subset ofE
may be inV
.