I have a graph database stored on AWS Neptune, which I need to query with Gremlin from within a Jupyter IPython notebook. I am using the graph neural network (GNN) functionality offered by Neptune ML for a link prediction task. Specifically, I want to predict which nodes of "TYPE_X" are related to the ones saved in my variable "id_variable".
My query looks like this:
%%gremlin
g.with("Neptune#ml.endpoint","${endpoint}").
V(${id_variable}).
project('name', 'related to').
by('name').
by( out('RELATED_TO').with("Neptune#ml.prediction").
hasLabel('TYPE_X').values('name') ).
order(local).by(keys, desc)
which returns the following output:
{'name': 'AANAT', 'related to': 'WDR7'}
{'name': 'ACACA', 'related to': 'BTN1A1'}
{'name': 'ACTA1', 'related to': 'MDH'}
{'name': 'ALAS1', 'related to': 'WDR7'}
{'name': 'ALAS2', 'related to': 'TAC3'}
{'name': 'ALDH2', 'related to': 'SOCS2'}
{'name': 'ALDOA', 'related to': 'PRKAB2'}
{'name': 'AKR1B1', 'related to': 'ODF2L'}
{'name': 'ALOX15', 'related to': 'BMP15'}
My problem is that this output is only shown embedded in the output of the notebook cell; I would like to either assign it to a variable or store it in a file, for instance as JSON. I cannot do variable assignment with the %%gremlin
cell magic, and so far I have not found any way to write the output to a file.
Please note that I was not able to run this query in a plain .py script via the gremlin_python library, as it does not seem to support Neptune's ML functionality (specifically, it throws an error on the .with("Neptune#ml.endpoint","${endpoint}")
syntax).
Any suggestion is more than welcome!
Thank you in advance.
Have you tried using the --store-to (or -s) parameter? It specifies the name of a variable in which to store the query results,
which you can then read back in the next cell.
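For example (a minimal sketch: the variable name results and the file name predictions.json are placeholders, and the exact shape of the stored object may vary between graph-notebook versions), the magic cell would become:

%%gremlin --store-to results
g.with("Neptune#ml.endpoint","${endpoint}").
V(${id_variable}).
project('name', 'related to').
by('name').
by( out('RELATED_TO').with("Neptune#ml.prediction").
hasLabel('TYPE_X').values('name') ).
order(local).by(keys, desc)

Then, in the next (plain Python) cell:

import json

# 'results' should now hold the query results (typically a list of dicts,
# as in the output above). Inspect it first with print(type(results), results)
# if you are unsure of its shape.
with open('predictions.json', 'w') as f:
    json.dump(results, f, indent=2, default=str)

The default=str argument is just a safety net in case some returned values are not directly JSON-serializable.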